MRP-Related ABC Transporter
Encoding Nucleic Acids and Methods of Use Thereof



Pursuant to 35 U.S.C. §202(c), it is acknowledged that
the U.S. Government has certain rights in the invention
described herein, which was made in part with funds from
the National Institutes of Health, Grant Numbers CA63173
and CA06927.

FIELD OF THE INVENTION 

The present invention relates to the fields of 
medicine and molecular biology. More specifically, the 
invention provides nucleic acid molecules and proteins 
encoded thereby which are involved in the development of 
resistance to pharmacological and chemotherapeutic agents
in tumor cells.

BACKGROUND OF THE INVENTION 

Several publications are referenced in this 
application in parentheses in order to more fully describe 
the state of the art to which this invention pertains. 
The disclosure of each of these publications is 
incorporated by reference herein. 

P-glycoprotein, the product of the MDR1 gene, was the 
first ABC transporter shown to confer resistance to 
cytotoxic agents. Pgp functions as an ATP-dependent 
efflux pump that reduces the intracellular concentration 
of a variety of chemotherapeutic agents by transporting
them across the plasma membrane (1). The multidrug
resistance phenotype associated with overexpression of Pgp
is of considerable clinical interest because natural
product drugs are second only to alkylating agents in
clinical utility, and many effective chemotherapeutic
regimens contain more than one natural product agent.
More recently, we and others have reported transfection 
studies indicating that MRP, another ABC family 
transporter, confers a multidrug resistance phenotype that 
includes many natural product drugs, but is distinct from 
the resistance phenotype associated with Pgp (2-6). MRP
shares only limited amino acid identity with Pgp, and this
is reflected in the different substrate specificities of
the two transporters. In contrast to Pgp, MRP can
transport a wide range of anionic organic conjugates,
including glutathione S-conjugates (7). In addition to
Pgp and MRP, there may be other transporters that are
involved in cytotoxic drug resistance. In the case of
natural product drugs, resistant cell lines have been
described that display a multidrug resistant phenotype
associated with a drug accumulation deficit, but do not
overexpress Pgp or MRP (8). ABC transporters have also
been linked to cisplatin resistance, and several lines of 
evidence suggest the possibility that pumps specific for 
organic anions may be involved: 1) decreased cisplatin 
accumulation is consistently observed in cisplatin 
resistant cell lines (9); 2) cisplatin is conjugated to 
glutathione in the cell, and this anionic conjugate is 
toxic in an in vitro biochemical assay (10); and 3)
biochemical studies using membrane vesicle preparations
have shown that cisplatin resistant cell lines have
enhanced expression of an ATP-dependent transporter of
CDDP-glutathione and other glutathione S-conjugates such
as the cysteinyl leukotriene LTC4 (11, 12). These data thus
suggest that an organic anion transporter may contribute
to cisplatin resistance by exporting CDDP-glutathione.
While MRP is an organic anion transporter, the reported
drug resistance profile of MRP-transfected cells does not
extend to cisplatin (5, 6), and to date only one
cisplatin resistant cell line has been reported to
overexpress MRP (13). This suggests that organic anion
transporters other than MRP may contribute to cisplatin
resistance. Consistent with this possibility, the
canalicular multispecific organic anion transporter,
cMOAT, an MRP-related transporter that functions as the
major organic anion transporter in liver, has been
reported to be overexpressed in cisplatin resistant cell
lines (14, 15). A more direct link between cMOAT and
cytotoxic drug resistance is suggested by a recent report
in which transfection of a cMOAT antisense construct into
a liver cancer cell line resulted in sensitization to
cisplatin, daunorubicin and other cytotoxic agents (16).

Clearly, a need exists for identifying the essential 
components and mechanisms giving rise to drug resistance 
and the transport of anticancer agents out of the tumor 
cell. The elucidation of these mechanisms may be used to 
advantage for the design of efficacious chemotherapeutic
agents.

SUMMARY OF THE INVENTION 

This invention provides novel biological molecules
useful for identification, detection, and/or molecular 
characterization of components involved in the acquisition 
of drug resistance in tumor cells. According to one 
aspect of the invention, an isolated nucleic acid molecule 
is provided which includes a sequence encoding a protein 
transporter of a size between about 1300 and 1350 amino 
acids in length. The encoded protein, referred to herein
as MOAT-B, comprises a multi-domain structure including a
tandem repeat of nucleotide binding folds appended
C-terminal to a hydrophobic domain that contains several
potential membrane spanning helices. Conserved Walker A 
and B ATP binding sites are present in each of the 
nucleotide binding folds. 
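
As an illustration of how conserved Walker A sites of the kind described above might be located in a candidate protein sequence, the following minimal Python sketch scans a sequence for the widely used Walker A consensus G-X(4)-G-K-[S/T]. The consensus pattern, the function name find_walker_a and the toy peptide are assumptions introduced for illustration only; they are not taken from the Sequence Listing, and the sketch does not attempt to locate Walker B sites.

```python
import re

# Widely used Walker A (P-loop) consensus: G, any four residues, G, K, then S or T.
# Assumption: this generic consensus is used for illustration only.
WALKER_A = re.compile(r"G.{4}GK[ST]")


def find_walker_a(protein):
    """Return (1-based start position, matched residues) for each Walker A-like site."""
    return [(m.start() + 1, m.group()) for m in WALKER_A.finditer(protein.upper())]


# Hypothetical toy peptide, not a MOAT sequence.
toy_peptide = "AAGPSGSGKSTLL"
print(find_walker_a(toy_peptide))
```

A match reported by such a scan would only flag a candidate site; confirming a nucleotide binding fold would still require alignment with known ABC transporters.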

In a preferred embodiment of the invention, an 
isolated nucleic acid molecule is provided that includes a 
cDNA encoding a human MOAT-B protein. In a particularly
preferred embodiment, the human MOAT-B protein has an
amino acid sequence the same as Sequence I.D. No. 2. An
exemplary MOAT-B nucleic acid molecule of the invention 
comprises Sequence I.D. No. 1. 

According to another aspect of the invention, a 
second isolated nucleic acid molecule is provided which 
includes a sequence encoding a transporter between about 
1400 and 1450 amino acids. The encoded protein, referred 
to herein as MOAT-C, contains a multi-domain structure
including a tandem repeat of nucleotide binding folds
appended C-terminal to a hydrophobic domain that contains
several potential membrane spanning helices. Conserved
Walker A and B ATP binding sites are present in each of
the nucleotide binding folds. While similar in structure
to MOAT-B described above, MOAT-C contains distinct
sequence differences.

In a preferred embodiment of the invention, an 
isolated nucleic acid molecule is provided that includes a 
cDNA encoding a human MOAT-C protein. In a particularly 
preferred embodiment, the human MOAT-C protein has an 
amino acid sequence the same as Sequence I.D. No. 4. An 
exemplary MOAT-C nucleic acid molecule of the invention 
comprises Sequence I.D. No. 3. 

According to yet another aspect of the invention, an
isolated nucleic acid molecule is provided which includes
a sequence encoding a protein of a size between about 1500
and 1550 amino acids in length. The encoded protein,
referred to herein as MOAT-D, contains a multidomain
structure including an N-terminal hydrophobic extension
which harbors five transmembrane spanning helices.

In a preferred embodiment of the invention, an 
isolated nucleic acid molecule is provided that includes a 
cDNA encoding a MOAT-D protein. In a particularly
preferred embodiment, the human MOAT-D protein has an
amino acid sequence the same as Sequence I.D. No. 6. An
exemplary MOAT-D nucleic acid molecule of the invention
comprises Sequence I.D. No. 5. 

According to yet another aspect of the invention, an 
isolated nucleic acid molecule is provided which includes 
a sequence encoding a protein of a size between about 1480 
and 1530 amino acids in length. The encoded protein,
referred to herein as MOAT-E, contains a multidomain
structure including an N-terminal hydrophobic extension
which harbors several transmembrane spanning helices.
While similar in structure to MOAT-D described above,
MOAT-E contains distinct sequence differences. 

In a preferred embodiment of the invention, an 
isolated nucleic acid molecule is provided that includes a 
cDNA encoding a MOAT-E protein. In a particularly 
preferred embodiment, the human MOAT-E protein has an 
amino acid sequence the same as Sequence I.D. No. 8. An 
exemplary MOAT-E nucleic acid molecule of the invention 
comprises Sequence I.D. No. 7. 

According to another aspect of the present invention, 
an isolated nucleic acid molecule is provided, which has a 
sequence selected from the group consisting of: (1)
Sequence I.D. No. 1; (2) a sequence specifically
hybridizing with preselected portions or all of the
complementary strand of Sequence I.D. No. 1 comprising
nucleic acids encoding amino acids 1-1154 of Sequence I.D.
No. 2; (3) a sequence encoding preselected portions of
Sequence I.D. No. 1 within nucleotides 1-3462; (4)
Sequence I.D. No. 3; (5) a sequence specifically
hybridizing with preselected portions or all of the
complementary strand of Sequence I.D. No. 3 comprising
nucleic acids encoding amino acids 1-442 of Sequence I.D.
No. 4; (6) a sequence encoding preselected portions of
Sequence I.D. No. 3 within nucleotides 1-1326; (7)
Sequence I.D. No. 5; (8) a sequence specifically
hybridizing with preselected portions or all of the
complementary strand of Sequence I.D. No. 5 comprising
nucleic acids encoding amino acids 1-1036 of Sequence I.D.
No. 6; (9) a sequence encoding preselected portions of
Sequence I.D. No. 5 within nucleotides 1-3108; (10)
Sequence I.D. No. 7; (11) a sequence specifically
hybridizing with preselected portions or all of the
complementary strand of Sequence I.D. No. 7 comprising
nucleic acids encoding amino acids 1-998 of Sequence I.D.
No. 8; and (12) a sequence encoding preselected portions of
Sequence I.D. No. 7 within nucleotides 1-300.
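
The amino acid ranges and nucleotide ranges recited for Sequence I.D. Nos. 1 through 6 are related by the triplet genetic code, three nucleotides per encoded residue. The following minimal Python sketch shows that arithmetic. It assumes, for illustration only, that nucleotide 1 is the first base of the codon for residue 1; the function name coding_span is hypothetical.

```python
def coding_span(first_aa, last_aa):
    """Nucleotide range (1-based, inclusive) encoding residues first_aa..last_aa.

    Assumption: nucleotide 1 is the first base of the codon for residue 1.
    """
    start = 3 * (first_aa - 1) + 1
    end = 3 * last_aa
    return start, end


# Last residues of the ranges recited above for Sequence I.D. Nos. 2, 4 and 6.
for seq_id, last_residue in (("No. 2", 1154), ("No. 4", 442), ("No. 6", 1036)):
    print(seq_id, coding_span(1, last_residue))
```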

Such partial sequences are useful as probes to 
identify and isolate homologues of the MOAT genes of the 
invention. Additionally, isolated nucleic acid sequences 
encoding natural allelic variants of the nucleic acids of 
Sequence I.D. Nos. 1, 3, 5 and 7 are also contemplated to
be within the scope of the present invention. The term
"natural allelic variants" will be defined hereinbelow.

According to another aspect of the present invention, 
antibodies immunologically specific for the human MOAT 
proteins described hereinabove are provided.

In yet another aspect of the invention, host cells
comprising at least one of the MOAT encoding nucleic acids
are provided. Such host cells include but are not limited
to bacterial cells, fungal cells, insect cells, mammalian
cells, and plant cells. Host cells overexpressing one
or more of the MOAT encoding nucleic acids of the
invention provide valuable research tools for assessing
transport of chemotherapeutic agents out of cells.
MOAT expressing cells also comprise a biological system
useful in methods for identifying inhibitors of the MOAT
transporters.

Another embodiment of the present invention 
encompasses methods for screening cells expressing MOAT 
encoding nucleic acids for chemotherapy resistance. Such 
methods will provide the clinician with data which
correlate expression of a particular MOAT gene with a
particular chemotherapy resistant phenotype.

Diagnostic methods are also contemplated in the 
present invention. Accordingly, suitable oligonucleotide 
probes are provided which hybridize to the nucleic acids
of the invention. Such probes may be used to advantage in
screening biopsy samples for the expression of particular
MOAT genes. Once a tumor sample has been characterized as
to the MOAT gene(s) expressed therein, inhibitors
identified in the cell line screening methods described 
above may be administered to prevent efflux of the 
beneficial chemotherapeutic agents from cancer cells. 

The methods of the invention may be applied to kits. 
An exemplary kit of the invention comprises MOAT gene 
specific oligonucleotide probes and/or primers, MOAT 
encoding DNA molecules for use as a positive control,
buffers, and an instruction sheet. A kit for practicing
the cell line screening method includes frozen cells
comprising the MOAT genes of the invention, suitable
culture media, buffers and an instruction sheet.

In a further aspect of the invention, transgenic 
knockout mice are disclosed. Mice will be generated in
which at least one MOAT gene has been knocked out. Such
mice will provide a valuable biological system for
assessing resistance to chemotherapy in an in vivo tumor
model.

Various terms relating to the biological molecules of 
the present invention are used hereinabove and also 
throughout the specification and claims. The terms 
"percent similarity" and "percent identity (identical)" 
are used as set forth in the UW GCG Sequence Analysis 
program (Devereux et al., NAR 12:387-397 (1984)).
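
For orientation only, the following Python sketch shows one simple way a percent identity figure can be computed from two already aligned sequences. It is not the GCG implementation, which applies its own scoring and conventions. The function name percent_identity, the use of a period as the gap character (as in the alignment figures) and the choice to count only columns without gaps are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="."):
    """Percent of identical residues over alignment columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Hypothetical toy alignment: 4 identical residues over 5 gap-free columns.
print(percent_identity("ACDE.G", "ACQEFG"))
```

Other conventions, for example dividing by the length of the shorter sequence, give different values, so figures from different programs are not always directly comparable.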

With reference to nucleic acids of the invention, the 
term "isolated nucleic acid" is sometimes used. This 
term, when applied to DNA, refers to a DNA molecule that 
is separated from sequences with which it is immediately 
contiguous (in the 5' and 3' directions) in the naturally
occurring genome of the organism from which it originates.
For example, the "isolated nucleic acid" may comprise a 
DNA or cDNA molecule inserted into a vector, such as a 
plasmid or virus vector, or integrated into the genomic 
DNA of a prokaryote or eukaryote. 

With respect to RNA molecules of the invention, the 
term "isolated nucleic acid" primarily refers to an RNA 
molecule encoded by an isolated DNA molecule as defined 
above. Alternatively, the term may refer to an RNA 
molecule that has been sufficiently separated from RNA 
molecules with which it would be associated in its natural 
state (i.e., in cells or tissues), such that it exists in
a "substantially pure" form (the term "substantially pure"
is defined below).

With respect to protein, the term "isolated protein"
or "isolated and purified protein" is sometimes used
herein. This term refers primarily to a protein produced
by expression of an isolated nucleic acid molecule of the
invention. Alternatively, this term may refer to a
protein which has been sufficiently separated from other 
proteins with which it would naturally be associated, so 
as to exist in "substantially pure" form. 

The term "substantially pure" refers to a preparation 
comprising at least 50-60% by weight the compound of 
interest (e.g., nucleic acid, oligonucleotide, protein, 
etc.). More preferably, the preparation comprises at 
least 75% by weight, and most preferably 90-99% by weight, 
the compound of interest. Purity is measured by methods 
appropriate for the compound of interest (e.g.,
chromatographic methods, agarose or polyacrylamide gel
electrophoresis, HPLC analysis, and the like). With
respect to antibodies of the invention, the term
"immunologically specific" refers to antibodies that bind
to one or more epitopes of a protein of interest (e.g.,
MOAT-B, MOAT-C or MOAT-D), but which do not substantially
recognize and bind other molecules in a sample containing 
a mixed population of antigenic biological molecules. 

With respect to nucleic acids and oligonucleotides, 
the term "specifically hybridizing" refers to the 
association between two single-stranded nucleotide
molecules of sufficiently complementary sequence to permit
such hybridization under pre-determined conditions
generally used in the art (sometimes termed "substantially
complementary"). When used in reference to a double
stranded nucleic acid, this term is intended to signify
that the double stranded nucleic acid has been subjected
to denaturing conditions, as is well known to those of
skill in the art. In particular, the term refers to
hybridization of an oligonucleotide with a substantially
complementary sequence contained within a single-stranded
DNA or RNA molecule of the invention, to the substantial
exclusion of hybridization of the oligonucleotide with
single-stranded nucleic acids of non-complementary
sequence.

One common formula for calculating the stringency
conditions required to achieve hybridization between
nucleic acid molecules of a specified sequence homology
is set forth below (Sambrook et al., 1989):

Tm = 81.5°C + 16.6 log[Na+] + 0.41(% G+C) - 0.63(% formamide) - 600/#bp in duplex

As an illustration of the above formula, using [Na+]
= 0.368 and 50% formamide, with a GC content of 42% and an
average probe size of 200 bases, the Tm is 57°C. The Tm of
a DNA duplex decreases by 1-1.5°C with every 1% decrease
in homology. Thus, targets with greater than about 75%
sequence identity would be observed using a hybridization
temperature of 42°C. Such sequences would be considered
substantially homologous to the nucleic acid sequences of
the invention.
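
The worked illustration above can be reproduced with a few lines of Python. This is a minimal sketch of the recited formula, assuming that the logarithm is base 10 and that the sodium concentration is expressed in moles per liter; the function and parameter names are hypothetical.

```python
import math


def melting_temperature(na_molar, percent_gc, percent_formamide, duplex_bp):
    """Tm in degrees C from the formula recited above (Sambrook et al., 1989).

    Assumptions: log is base 10 and na_molar is in mol/L.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * percent_gc
            - 0.63 * percent_formamide
            - 600.0 / duplex_bp)


# Values from the illustration: [Na+] = 0.368, 50% formamide, 42% GC, 200 bases.
tm = melting_temperature(na_molar=0.368, percent_gc=42,
                         percent_formamide=50, duplex_bp=200)
print(round(tm, 1))
```

With these inputs the computed value rounds to the 57°C stated in the illustration.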

The nucleic acids, proteins, antibodies, cell lines, 
methods, and kits of the present invention may be used to 
advantage to identify targets for the development of novel 
agents which inhibit the aberrant transport of cytotoxic
agents out of tumor cells. The transgenic mice of the
invention may be used as an in vivo model for chemotherapy
resistance.

The human MOAT molecules, methods and kits described
above may also be used as research tools and will
facilitate the elucidation of the mechanism by which tumor
cells acquire a drug resistant phenotype.

WO 99/49735 PCT/US99/06644



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 shows the predicted structure of MOAT-B and
comparison with human MRP. The vertical lines indicate
identical amino acids and the vertical dots indicate
conserved amino acids. Gaps are indicated by periods.
The overbars indicate potential transmembrane spanning
segments as predicted by the TMAP program. The first and
second nucleotide binding folds (NBF 1 and NBF 2) are
indicated by horizontal arrows. The C-terminal 34 amino
acids (residues 1291-1325) are replaced in the second
class of MOAT-B cDNA clones by the following amino acids:
ILQKKLSTYWSH. The alignment was performed using the GAP
program (gap weight 3.0, length weight 0.1) in the 
Genetics Computer Group Package. H. MRP: human MRP. 

Figures 2A and 2B depict a comparison of the 
nucleotide binding folds and hydropathy profile of MOAT-B
with those of other eukaryotic ABC transporters. Fig. 2A
shows the comparison of the nucleotide binding folds of
MOAT-B. Amino acids that are identical to those of MOAT-B
are shaded, and gaps are indicated by periods. Walker A
and B motifs, and the ABC transporter family signature
sequence C, are underlined. Amino acid positions are
indicated to the right. Amino acid sequences were aligned
using the PILEUP program (gap weight 3.0, length weight
0.1) in the Genetics Computer Group Package. Fig. 2B
shows a comparison of the MOAT-B hydropathy profile. To
facilitate comparison, the proteins are aligned so that
the N-terminal nucleotide binding folds (NBF) are roughly
in register. NBFs are indicated by bars. Values above
and below the horizontal lines indicate hydrophobic and
hydrophilic regions, respectively. Hydrophobicity plots
were generated using the Kyte-Doolittle algorithm with a
window of 7 residues. The transporters shown are: human
multidrug-associated protein, H. MRP (P33529); human
multispecific organic anion transporter, H. MOAT (U63970);
Saccharomyces cerevisiae yeast cadmium factor 1, S. YCF1
(P39109); rat sulfonylurea receptor, R. SUR (Q09427);
human cystic fibrosis transmembrane conductance regulator,
H. CFTR (M28668); Leishmania P-glycoprotein, L. PgpA
(P21441); and human mdr1 gene product, H. MDR1 (P08183).
Accession numbers are shown in parentheses.
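
A hydropathy profile of the kind described above can be sketched in Python as a sliding-window average of Kyte-Doolittle hydropathy values. The sketch below is illustrative only and is not the software used to prepare the figures. The function name hydropathy_profile is hypothetical, the window of 7 residues follows the text, and the input is the 12-residue peptide recited in the description of Figure 1.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Mean Kyte-Doolittle value for each run of `window` consecutive residues."""
    scores = [KD[res] for res in protein.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# 12 residues with a window of 7 give 6 overlapping windows.
profile = hydropathy_profile("ILQKKLSTYWSH")
print(len(profile))
```

In a plot of such a profile, positive values correspond to hydrophobic stretches and negative values to hydrophilic stretches, matching the convention stated for the figures.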

Figure 3 is a Northern blot showing the tissue 
distribution of MOAT-B transcript. Membranes containing
poly(A)+ RNA prepared from human tissues were hybridized
with a radiolabeled MOAT-B or GAPDH probe. Top panels
show MOAT-B transcript and bottom panels show the control 
GAPDH transcript. Arrows indicate the position of MOAT-B 
transcript. Prolonged exposure of the film revealed a low 
level signal in liver. 

Figure 4 shows the chromosomal localization of the 
gene encoding MOAT-B. Human metaphase spreads were 
hybridized with a biotin-labeled MOAT-B cDNA probe and
detected by FITC-conjugated avidin. Hybridization signals
at chromosome 13q32 in two metaphase spreads are indicated
by arrows. The inset shows paired hybridization signals
at band q32 of chromosome 13 from three other metaphase
spreads.

Figures 5A and 5B show the predicted structures of
MOAT-C and MOAT-D. Fig. 5A presents the structure of
MOAT-C. Fig. 5B shows the structure of MOAT-D. Numbered
overbars indicate potential transmembrane spanning
helices. Horizontal arrows indicate the positions of the
amino terminal (NBF1) and C-terminal (NBF2) nucleotide
binding folds. Walker A and B motifs, and the ABC
transporter family signature sequence C are underlined.
Bullets indicate the positions of potential N-linked
glycosylation sites that are conserved with previously
reported N-glycosylation sites in MRP. The indicated
MOAT-C transmembrane spanning helices were predicted using 
the TMAP program and an input alignment of MOAT-B and 
MOAT-C. The indicated MOAT-D transmembrane helices are 
based upon inspection of an alignment with MRP. 

Figures 6A and 6B show a comparison of the nucleotide 
binding folds and hydropathy profiles of MOAT-C and MOAT-D 
with those of other related ABC transporters. Fig. 6A 
depicts the comparison of the nucleotide binding folds. 
The alignment was produced using the PILEUP command (gap 
weight 3.0, length weight 0.1) in the Genetics Computer 
Group Package Version 9.1. Amino acid positions conserved 
in at least 4 of the 8 proteins are shaded. Periods 
indicate gaps in the alignment. Walker A and B motifs, and the
ABC transporter family signature sequence C, are indicated
by underbars. Fig. 6B shows the comparison of hydropathy
profiles. To facilitate comparisons, gaps were introduced
at the N-termini of some proteins in order to bring the
first nucleotide binding folds into register. Nucleotide
binding folds are indicated by bars. Values above and
below the horizontal lines indicate hydrophobic and
hydrophilic regions, respectively. Hydrophobicity plots
were generated using the Kyte-Doolittle algorithm with a
window of 7 residues. Accession numbers are as follows:
MRP, P33529; cMOAT, U63970; SUR, Q09428; CFTR, P13569;
MDR1, P08183.



Figure 7 is a Northern blot showing the tissue 
distribution of MOAT-C and MOAT-D transcripts. Blots
containing poly A+ RNA prepared from various human tissues 
were hybridized with MOAT-C, MOAT-D and actin probes. 
Arrows indicate the position of the MOAT-C (top panel) and 
MOAT-D (middle panel) transcripts. The bottom panel shows 
the control actin transcript. 

Figures 8A and 8B show the chromosomal localization 
of the MOAT-C and MOAT-D genes. Human metaphase spreads 
were hybridized with biotin-labeled MOAT-C and MOAT-D
cDNA probes and detected by FITC-conjugated avidin. Fig.
8A shows the localization of MOAT-C. Hybridization
signals at chromosome 3q27 in two metaphase spreads are
indicated by arrows (top). The inset shows paired
hybridization signals at band q27 of chromosome 3 from
three other metaphase spreads. Fig. 8B shows the
localization of MOAT-D. Hybridization signals at
chromosome 17q21-22 in two metaphase spreads are indicated
by arrows (top). The inset shows paired hybridization
signals at band q21-22 of chromosome 17 from three other
metaphase spreads.

Figure 9 shows the predicted amino acid sequence of MOAT-E.
Also shown are the locations of the potential
transmembrane helices (overbars), the potential
N-glycosylation site (black dot) and the two nucleotide
binding folds (NBF1 and NBF2). Walker A and B motifs, as
well as the signature C motif of ABC transporters, are 
also indicated.

Figure 10 shows a comparison of the hydropathy
profile of MOAT-E with other members of the MRP/cMOAT
subfamily. The profile reveals that MOAT-E has a
hydrophobic N-terminal segment which is absent in MOAT-B
and MOAT-C.

Figure 11 is an RNA blot which reveals that MOAT-E is
expressed only in the liver and the kidney, suggesting 
that MOAT-E may participate in the excretion of substances 
into urine and bile. The lower panel shows hybridization 
of an actin probe to assess RNA loading. 

Figures 12A-12J show the cDNA (SEQ ID NO: 1) and
amino acid sequences (SEQ ID NO: 2) encoded by MOAT-B.

Figures 13A-13K show the cDNA (SEQ ID NO: 3) and
amino acid sequences (SEQ ID NO: 4) encoded by MOAT-C.

Figures 14A-14K show the cDNA (SEQ ID NO: 5) and
amino acid sequences (SEQ ID NO: 6) encoded by MOAT-D.

Figures 15A-15K show the cDNA (SEQ ID NO: 7) and
amino acid sequences (SEQ ID NO: 8) encoded by MOAT-E.



DETAILED DESCRIPTION OF THE INVENTION 

MRP and cMOAT are closely related mammalian ABC 
transporters that export organic anions from cells. 
Transfection studies have established that MRP confers 
resistance to natural product cytotoxic agents, and recent 
evidence suggests the possibility that cMOAT may 
contribute to cytotoxic drug resistance as well. Based 
upon the potential importance of these transporters in
clinical drug resistance, and their important
physiological roles in the export of the amphiphilic
products of phase I and phase II metabolism, we sought to
identify other MRP-related transporters. Using a
degenerate PCR approach, a cDNA molecule was isolated
which encodes a novel ABC transporter designated herein as
MOAT-B. The MOAT-B gene was mapped using fluorescence in
situ hybridization to chromosome band 13q32. Comparison
of the MOAT-B predicted protein with other transporters
revealed that it is most closely related to MRP, cMOAT,
and the yeast organic anion transporter YCF1. While
MOAT-B is closely related to these transporters, it is
distinguished by the absence of an approximately 200 amino
acid N-terminal hydrophobic extension that is present in
MRP and cMOAT, and which is predicted to encode several
transmembrane spanning segments. In addition, the MOAT-B
tissue distribution is distinct from that of MRP and cMOAT. In
contrast to MRP, which is widely expressed in most
tissues, including liver, and cMOAT, whose expression is
largely restricted to liver, the MOAT-B transcript is
widely expressed, with particularly high levels in
prostate, but is barely detectable in liver. These data
indicate that MOAT-B is a ubiquitously expressed
transporter that is closely related to MRP and cMOAT, and
indicate that it is an organic anion pump relevant to
cellular detoxification.

Three additional MRP/cMOAT-related transporters,
MOAT-C, MOAT-D and MOAT-E, are also disclosed herein.
MOAT-C encodes a 1437 amino acid protein that is most
closely related to MRP, cMOAT and MOAT-B, among eukaryotic
transporters (33%-37% identity). However, based upon
amino acid identity, MOAT-C is considerably less related
to MRP and cMOAT than the latter transporters are to each
other (48% identity). In addition, the MOAT-C topology is
distinct from that of MRP and cMOAT in that it, like
MOAT-B, lacks an N-terminal transmembrane spanning domain.
MOAT-D encodes a 1530 amino acid transporter that is
highly related to MRP (57% identity) and cMOAT (47%
identity). MOAT-E encodes a 1503 amino acid transporter
that is highly related to MOAT-D, MRP and cMOAT (39-45%
identity). The topologies of MOAT-D and MOAT-E are quite
similar to those of MRP and cMOAT, in that they have an N-terminal
hydrophobic extension that is predicted to harbor five
transmembrane spanning helices. MOAT-C and MOAT-D were
mapped to chromosome bands 3q27 and 17q21-22,
respectively, by fluorescence in situ hybridization.

The expression patterns of MOAT-C, MOAT-D and MOAT-E 
are distinct from those of MRP, cMOAT and MOAT-B. MOAT-C 
transcript is widely expressed, with highest levels in 
skeletal muscle, kidney and testis, but is expressed at 
barely detectable levels in liver and lung. MOAT-D 
transcript has a more restricted expression pattern, with 
high levels in colon, pancreas, liver and kidney. Data 
presented herein reveal that MOAT-E expression is 
restricted to liver and kidney. 

Based upon degree of amino acid identity, and protein 
topology, the MRP-related transporters fall into two 
groups, with the first group consisting of MRP, cMOAT, 
MOAT-D and MOAT-E, and the second group consisting of 
MOAT-B and MOAT-C. The isolation of MOAT-C, MOAT-D and 
MOAT-E thus helps to define the MRP/cMOAT subfamily. The 
high degree of amino acid identity and topological 
similarity of MOAT-D and MOAT-E to MRP and cMOAT suggest 
that they function as organic anion transporters, and play 
a role in cytotoxic drug resistance. In contrast, the 
lower degree of amino acid identity and distinct topology
of MOAT-B and MOAT-C suggest the possibility that their
substrate specificities and functions may be distinct from
those of MRP, cMOAT, MOAT-D and MOAT-E.

The compositions, methods, kits and transgenic mice 
of the invention disclosed herein will facilitate the 
identification of drugs that cripple the ability of MOAT 
genes and proteins encoded thereby to effect the efflux of 
clinically beneficial pharmacological agents in malignant 
cells.



I. Preparation of MOAT-Encoding Nucleic Acid Molecules,
MOAT Proteins, and Antibodies Thereto 
A. Nucleic Acid Molecules 

Nucleic acid molecules encoding the MOAT proteins of 
the invention may be prepared by two general methods: (1) 
synthesis from appropriate nucleotide triphosphates, or 
(2) isolation from biological sources. Both methods 
utilize protocols well known in the art. The availability 
of nucleotide sequence information, such as cDNAs having 
Sequence I.D. Nos. 1, 3, 5, or 7 enables preparation of an 
isolated nucleic acid molecule of the invention by 
oligonucleotide synthesis. Synthetic oligonucleotides may 
be prepared by the phosphoramidite method employed in the
Applied Biosystems 38A DNA Synthesizer or similar devices.
The resultant construct may be purified according to
methods known in the art, such as high performance liquid
chromatography (HPLC). Long, double-stranded
polynucleotides, such as a DNA molecule of the present
invention, must be synthesized in stages, due to the size
limitations inherent in current oligonucleotide synthetic
methods. Thus, for example, a 5 kb double-stranded
molecule may be synthesized as several smaller segments of 
appropriate complementarity. Complementary segments thus
produced may be annealed such that each segment possesses 
appropriate cohesive termini for attachment of an adjacent 
segment. Adjacent segments may be ligated by annealing 
cohesive termini in the presence of DNA ligase to 
construct an entire 5 kb double-stranded molecule. A 
synthetic DNA molecule so constructed may then be cloned 
and amplified in an appropriate vector. 

Nucleic acid sequences encoding the MOAT proteins of 
the invention may be isolated from appropriate biological 
sources using methods known in the art. In a preferred
embodiment, a cDNA clone is isolated from a cDNA
expression library of human origin. In an alternative
embodiment, utilizing the sequence information provided by
the cDNA sequence, human genomic clones encoding MOAT
proteins may be isolated. Alternatively, cDNA or genomic
clones having homology with MOAT-B, MOAT-C, MOAT-D or
MOAT-E may be isolated from other species using
oligonucleotide probes corresponding to predetermined 
sequences within the MOAT encoding nucleic acids. 

In accordance with the present invention, nucleic
acids having the appropriate level of sequence homology
with the protein coding region of Sequence I.D. Nos. 1, 3,
5, and 7 may be identified by using hybridization and
washing conditions of appropriate stringency. For
example, hybridizations may be performed, according to the
method of Sambrook et al. (supra), using a hybridization
solution comprising: 5X SSC, 5X Denhardt's reagent, 1.0%
SDS, 100 µg/ml denatured, fragmented salmon sperm DNA,
0.05% sodium pyrophosphate and up to 50% formamide.
Hybridization is carried out at 37-42°C for at least six
hours. Following hybridization, filters are washed as
follows: (1) 5 minutes at room temperature in 2X SSC and
1% SDS; (2) 15 minutes at room temperature in 2X SSC and
0.1% SDS; (3) 30 minutes-1 hour at 37°C in 1X SSC and 1%
SDS; (4) 2 hours at 42-65°C in 1X SSC and 1% SDS, changing
the solution every 30 minutes.
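
By way of illustration only, the volumes of stock reagents needed to prepare the hybridization solution recited above may be computed as in the following Python sketch. The final concentrations are those stated in the text, with formamide taken at its stated maximum of 50%. The stock concentrations (20X SSC, 50X Denhardt's reagent, 10% SDS, 10 mg/ml salmon sperm DNA, 5% sodium pyrophosphate, 100% formamide) are assumed typical laboratory values and are not taken from the present disclosure.

```python
# Final concentrations as recited in the text (formamide at its maximum).
FINAL = {
    "SSC (X)": 5.0,
    "Denhardt's reagent (X)": 5.0,
    "SDS (%)": 1.0,
    "salmon sperm DNA (ug/ml)": 100.0,
    "sodium pyrophosphate (%)": 0.05,
    "formamide (%)": 50.0,
}

# ASSUMED stock concentrations, in the same units as FINAL.
STOCK = {
    "SSC (X)": 20.0,
    "Denhardt's reagent (X)": 50.0,
    "SDS (%)": 10.0,
    "salmon sperm DNA (ug/ml)": 10000.0,
    "sodium pyrophosphate (%)": 5.0,
    "formamide (%)": 100.0,
}


def component_volumes(total_ml, final=FINAL, stock=STOCK):
    """Return the volume (ml) of each stock, plus water, for total_ml."""
    # Dilution rule: stock_conc * stock_volume = final_conc * total_volume.
    volumes = {name: total_ml * final[name] / stock[name] for name in final}
    water = total_ml - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute to reach the final volume")
    volumes["water"] = water
    return volumes


recipe = component_volumes(10.0)
for name, ml in recipe.items():
    print(f"{name}: {ml:.2f} ml")
```

If a lower formamide concentration is chosen, the corresponding entry in FINAL is reduced and the balance is made up with water.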

Nucleic acids of the present invention may be 
maintained as DNA in any convenient cloning vector. In a 
preferred embodiment, clones are maintained in a plasmid 
cloning/expression vector, such as pBluescript 
(Stratagene, La Jolla, CA), which is propagated in a
suitable E. coli host cell.

MOAT-encoding nucleic acid molecules of the invention
include cDNA, genomic DNA, RNA, and fragments thereof
which may be single- or double-stranded. Thus, this
invention provides oligonucleotides (sense or antisense
strands of DNA or RNA) having sequences capable of
hybridizing with at least one sequence of a nucleic acid
molecule of the present invention, such as selected
segments of the cDNA having Sequence I.D. No. 1. Such
oligonucleotides are useful as probes for detecting or
isolating MOAT genes. Antisense nucleic acid molecules
may be targeted to translation initiation sites and/or
splice sites to inhibit the translation of the
MOAT-encoding nucleic acids of the invention. Such
antisense molecules are typically between 15 and 30
nucleotides in length and often span the translational
start site of MOAT-encoding mRNA molecules.
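
As a non-limiting sketch of how an antisense oligonucleotide spanning a translational start site might be derived, the following Python example takes a window of sense-strand sequence around an ATG codon and returns its reverse complement. The function names, the default window parameters and the toy cDNA sequence are hypothetical. Treating the first ATG in the sequence as the start codon is a simplifying assumption that would have to be checked against the annotated coding region.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_over_start(cdna, length=20, upstream=8):
    """Return an antisense DNA oligo (5' to 3') spanning the start codon.

    length   -- oligo length; the text gives 15 to 30 nucleotides
    upstream -- nucleotides of 5' untranslated sequence to include (assumed)
    """
    if not 15 <= length <= 30:
        raise ValueError("length should be between 15 and 30 nucleotides")
    atg = cdna.find("ATG")  # ASSUMPTION: the first ATG is the start codon
    if atg < 0:
        raise ValueError("no ATG found")
    start = max(0, atg - upstream)
    target = cdna[start:start + length]
    if len(target) < length:
        raise ValueError("sequence too short for the requested oligo")
    return reverse_complement(target)


# Hypothetical sequence; not one of the Sequence I.D. Nos. of the disclosure.
toy_cdna = "GGCACGAGGCCATGCTGCCCGTGTACCAGGAGGTGAAGCC"
oligo = antisense_over_start(toy_cdna)
print(oligo)
```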

It will be appreciated by persons skilled in the art 
that variants of these sequences exist in the human 
population, and must be taken into account when designing 
and/or utilizing oligos of the invention. Accordingly, it 
is within the scope of the present invention to encompass 
such variants, with respect to the MOAT sequences 
disclosed herein or the oligos targeted to specific 
locations on the respective genes or RNA transcripts.
With respect to the inclusion of such variants, the term
"natural allelic variants" is used herein to refer to 
various specific nucleotide sequences and variants thereof 
that would occur in a human population. The usage of 
different wobble codons and genetic polymorphisms which 
give rise to conservative or neutral amino acid 
substitutions in the encoded protein are examples of such 
variants. Additionally, the term "substantially 
complementary" refers to oligo sequences that may not be 
perfectly matched to a target sequence, but the mismatches 
do not materially affect the ability of the oligo to 
hybridize with its target sequence under the conditions 
described.
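
To illustrate the distinction drawn above between wobble-codon variants and variants that change the encoded amino acid, the following Python sketch classifies a single-nucleotide substitution in a coding sequence using the standard genetic code. The function names and the toy coding sequence are hypothetical. The sketch does not attempt to judge whether a given amino acid change is conservative or neutral, which would require additional criteria not specified here.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def classify_substitution(cds, position, alt_base):
    """Classify a substitution at a 0-based position of an in-frame CDS.

    Returns "silent" when the encoded amino acid is unchanged, otherwise a
    (reference_amino_acid, variant_amino_acid) tuple.
    """
    codon_start = position - position % 3
    ref_codon = cds[codon_start:codon_start + 3]
    offset = position % 3
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    return "silent" if ref_aa == alt_aa else (ref_aa, alt_aa)


# Hypothetical coding sequence: ATG CTG CCC (Met Leu Pro).
toy_cds = "ATGCTGCCC"
print(classify_substitution(toy_cds, 5, "A"))  # third position of CTG
print(classify_substitution(toy_cds, 3, "A"))  # first position of CTG
```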

B. Proteins

Full-length MOAT-B, MOAT-C, MOAT-D and MOAT-E
proteins of the present invention may be prepared in a
variety of ways, according to known methods. The proteins
may be purified from appropriate sources, e.g.,
transformed bacterial or animal cultured cells or tissues,
by immunoaffinity purification. However, this is not a
preferred method due to the low amount of protein likely
to be present in a given cell type at any time. The
availability of nucleic acid molecules encoding MOAT
proteins enables production of the proteins using in vitro
expression methods known in the art. For example, a cDNA
or gene may be cloned into an appropriate in vitro
transcription vector, such as pSP64 or pSP65 for in vitro
transcription, followed by cell-free translation in a
suitable cell-free translation system, such as wheat germ
or rabbit reticulocytes. In vitro transcription and
translation systems are commercially available, e.g., from
Promega Biotech, Madison, Wisconsin or Gibco-BRL,
Gaithersburg, Maryland.

Alternatively, according to a preferred embodiment, 
larger quantities of MOAT proteins may be produced by 
expression in a suitable prokaryotic or eukaryotic system. 
For example, part or all of a DNA molecule, such as a cDNA 
having Sequence I.D. No. 1, 3, 5 or 7 may be inserted into 
a plasmid vector adapted for expression in a bacterial 
cell, such as E. coli. Such vectors comprise the
regulatory elements necessary for expression of the DNA in
the host cell positioned in such a manner as to permit
expression of the DNA in the host cell. Such regulatory 
elements required for expression include promoter 
sequences, transcription initiation sequences and, 
optionally, enhancer sequences. 

The human MOAT proteins produced by gene expression
in a recombinant prokaryotic or eukaryotic system may be
purified according to methods known in the art. In a
preferred embodiment, a commercially available
expression/secretion system can be used, whereby the
recombinant protein is expressed and thereafter secreted
from the host cell, to be easily purified from the
surrounding medium. If expression/secretion vectors are
not used, an alternative approach involves purifying the
recombinant protein by affinity separation, such as by
immunological interaction with antibodies that bind
specifically to the recombinant protein or by nickel columns
for isolation of recombinant proteins tagged with 6-8
histidine residues at their N-terminus or C-terminus.
Alternative tags may comprise the FLAG epitope or the
hemagglutinin epitope. Such methods are commonly used by
skilled practitioners.

The human MOAT proteins of the invention, prepared by 
the aforementioned methods, may be analyzed according to
standard procedures. For example, such proteins may be
subjected to amino acid sequence analysis, according to 
known methods. 

The present invention also provides antibodies 
capable of immunospecifically binding to proteins of the
invention. Polyclonal antibodies directed toward human 
MOAT proteins may be prepared according to standard 
methods. In a preferred embodiment, monoclonal antibodies 
are prepared, which react immunospecifically with the
various epitopes of the MOAT proteins described herein. 
Monoclonal antibodies may be prepared according to general 
methods of Kohler and Milstein, following standard 
protocols. Polyclonal or monoclonal antibodies that 
immunospecifically interact with MOAT proteins can be 
utilized for identifying and purifying such proteins. For 
example, antibodies may be utilized for affinity 
separation of proteins with which they immunospecifically 
interact. Antibodies may also be used to 
immunoprecipitate proteins from a sample containing a 
mixture of proteins and other biological molecules. Other 
uses of anti-MOAT antibodies are described below.

II. Uses of MOAT-Encoding Nucleic Acids,
MOAT Proteins and Antibodies Thereto

Cellular transporter molecules have received a great 
deal of attention as potential targets of chemotherapeutic 
agents designed to effectively block the export of 
pharmacological reagents from tumor cells. The MOAT 
proteins of the invention play a pivotal role in the 
transport of molecules across the cell membrane. 

Additionally, MOAT nucleic acids, proteins and 
antibodies thereto, according to this invention, may be 
used as research tools to identify other proteins that are
intimately involved in the transport of molecules into and
out of cells. Biochemical elucidation of molecular
mechanisms which govern such transport will facilitate the
development of novel anti-transport agents that may
sensitize tumor cells to conventional chemotherapeutic
agents.

A. MOAT-Encoding Nucleic Acids

MOAT-encoding nucleic acids may be used for a variety
of purposes in accordance with the present invention.
MOAT-encoding DNA, RNA, or fragments thereof may be used
as probes to detect the presence of and/or expression of
genes encoding MOAT proteins. Methods in which
MOAT-encoding nucleic acids may be utilized as
probes for such assays include, but are not limited to:
(1) in situ hybridization; (2) Southern hybridization; (3)
northern hybridization; and (4) assorted amplification
reactions such as polymerase chain reactions (PCR).

The MOAT-encoding nucleic acids of the invention may 
also be utilized as probes to identify related genes from 
other animal species. As is well known in the art, 
hybridization stringencies may be adjusted to allow 
hybridization of nucleic acid probes with complementary 
sequences of varying degrees of homology. Thus, 
MOAT-encoding nucleic acids may be used to advantage to 
identify and characterize other genes of varying degrees 
of relation to the MOAT genes of the invention. Such 
information enables further characterization of 
transporter molecules which give rise to the
chemoresistant phenotype of certain tumors. Additionally,
they may be used to identify genes encoding proteins that
interact with MOAT proteins (e.g., by the "interaction
trap" technique), which should further accelerate
identification of the components involved in the
acquisition of drug resistance. The MOAT-encoding nucleic
acids may also be used to generate primer sets suitable
for PCR amplification of target MOAT DNA. Criteria for
selecting suitable primers are well known to those of
ordinary skill in the art.
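
The text does not enumerate the primer selection criteria. Purely as an illustrative sketch, the following Python example screens a candidate primer pair on three commonly used rules of thumb: primer length, G+C content, and similarity of melting temperatures estimated by the simple 2°C per A/T, 4°C per G/C approximation, which is only a rough guide for short oligonucleotides. All threshold values, function names and primer sequences shown are hypothetical and are not taken from the present disclosure.

```python
def gc_fraction(primer):
    """Fraction of G and C residues in an uppercase primer sequence."""
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def primer_pair_ok(fwd, rev, min_len=18, max_len=25,
                   gc_range=(0.40, 0.65), max_tm_diff=5):
    """Apply ASSUMED length, GC-content and Tm-matching thresholds."""
    for primer in (fwd, rev):
        if not min_len <= len(primer) <= max_len:
            return False
        if not gc_range[0] <= gc_fraction(primer) <= gc_range[1]:
            return False
    return abs(wallace_tm(fwd) - wallace_tm(rev)) <= max_tm_diff


# Hypothetical primers, used only to exercise the functions.
fwd = "ATGCTGCCCGTGTACCAG"
rev = "GGCTTCACCTCCTGGTACAC"
print(wallace_tm(fwd), wallace_tm(rev), primer_pair_ok(fwd, rev))
```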

Nucleic acid molecules, or fragments thereof, 
encoding MOAT genes may also be utilized to control the 
production of MOAT proteins, thereby regulating the amount
of protein available to participate in cytotoxic drug 
efflux. As mentioned above, antisense oligonucleotides 
corresponding to essential processing sites in 
MOAT-encoding mRNA molecules may be utilized to inhibit 
MOAT protein production in targeted cells. Alterations in 
the physiological amount of MOAT proteins may dramatically 
affect the ability of these proteins to transport 
pharmacological reagents out of the cell. 

Host cells comprising at least one MOAT encoding DNA 
molecule are encompassed in the present invention. Host 
cells contemplated for use in the present invention 
include but are not limited to bacterial cells, fungal 
cells, insect cells, mammalian cells, and plant cells. 
The MOAT-encoding DNA molecules may be introduced singly into
such host cells or in combination to assess the phenotype
of cells conferred by such expression. Methods for
introducing DNA molecules are also well known to those of
ordinary skill in the art. Such methods are set forth in
Ausubel et al., eds., Current Protocols in Molecular
Biology, John Wiley & Sons, NY, NY 1995, the disclosure of
which is incorporated by reference herein.

The availability of MOAT encoding nucleic acids 
enables the production of strains of laboratory mice 
carrying part or all of the MOAT genes or mutated
sequences thereof. Such mice may provide an in vivo model
for development of novel chemotherapeutic agents.
Alternatively, the MOAT nucleic acid sequence information
provided herein enables the production of knockout mice in
which the endogenous genes encoding MOAT-B, MOAT-C, MOAT-D
or MOAT-E have been specifically inactivated. Methods of
introducing transgenes in laboratory mice are known to 
those of skill in the art. Three common methods include: 
1. integration of retroviral vectors encoding the foreign 
gene of interest into an early embryo; 2. injection of 
DNA into the pronucleus of a newly fertilized egg; and 3. 
the incorporation of genetically manipulated embryonic 
stem cells into an early embryo. 

The alterations to the MOAT gene envisioned herein 
include modifications, deletions, and substitutions. 
Modifications and deletions render the naturally occurring 
gene nonfunctional, producing a "knock out" animal. 
Substitution of the naturally occurring gene for a gene
from a second species results in an animal which expresses
a MOAT gene from the second species. Substitution of the
naturally occurring gene for a gene having a mutation
results in an animal with a mutated MOAT protein. A
transgenic mouse carrying the human MOAT gene is generated
by direct replacement of the mouse MOAT gene with the
human gene. These transgenic animals are valuable for use
in in vivo assays for elucidation of other medical disorders
associated with cellular activities modulated by MOAT 
genes. A transgenic animal carrying a "knock out" of a 
MOAT encoding nucleic acid is useful for the establishment 
of a nonhuman model for chemotherapy resistance involving 
MOAT regulation. 

As a means to define the role that MOAT plays in 
mammalian systems, mice can be generated that cannot make
MOAT proteins because of a targeted mutational disruption
of a MOAT gene.

The term "animal" is used herein to include all 
vertebrate animals, except humans. It also includes an 
individual animal in all stages of development, including 
embryonic and fetal stages. A "transgenic animal" is any 
animal containing one or more cells bearing genetic 
information altered or received, directly or indirectly, 
by deliberate genetic manipulation at the subcellular 
level, such as by targeted recombination or microinjection 
or infection with recombinant virus. The term "transgenic 
animal" is not meant to encompass classical cross-breeding 
or in vitro fertilization, but rather is meant to 
encompass animals in which one or more cells are altered 
by or receive a recombinant DNA molecule. This molecule 
may be specifically targeted to a defined genetic locus, be
randomly integrated within a chromosome, or it may be 
extrachromosomally replicating DNA. The term "germ cell 
line transgenic animal" refers to a transgenic animal in 
which the genetic alteration or genetic information was 
introduced into a germ line cell, thereby conferring the 
ability to transfer the genetic information to offspring. 
If such offspring in fact possess some or all of that
alteration or genetic information, then they, too, are
transgenic animals.

The alteration or genetic information may be foreign 
to the species of animal to which the recipient belongs, 
or foreign only to the particular individual recipient, or 
may be genetic information already possessed by the 
recipient. In the last case, the altered or introduced 
gene may be expressed differently than the native gene. 

The altered MOAT gene generally should not fully 
encode the same MOAT protein native to the host animal and
its expression product should be altered to a minor or
great degree, or absent altogether. However, it is 
conceivable that a more modestly modified MOAT gene will 
fall within the compass of the present invention if it is 
a specific alteration. 

The DNA used for altering a target gene may be 
obtained by a wide variety of techniques that include, but
are not limited to, isolation from genomic sources, 
preparation of cDNAs from isolated mRNA templates, direct 
synthesis, or a combination thereof. 

A preferred type of target cell for transgene 
introduction is the embryonal stem cell (ES). ES cells
may be obtained from pre-implantation embryos cultured in
vitro. Transgenes can be efficiently introduced into the 
ES cells by standard techniques such as DNA transfection 
or by retrovirus-mediated transduction. The resultant 
transformed ES cells can thereafter be combined with 
blastocysts from a non-human animal. The introduced ES 
cells thereafter colonize the embryo and contribute to the 
germ line of the resulting chimeric animal. 

One approach to the problem of determining the 
contributions of individual genes and their expression 
products is to use isolated MOAT genes to selectively 
inactivate the wild-type gene in totipotent ES cells (such
as those described above) and then generate transgenic
mice. The use of gene-targeted ES cells in the generation
of gene-targeted transgenic mice is known in the art.

Techniques are available to inactivate or alter any 
genetic region to a mutation desired by using targeted 
homologous recombination to insert specific changes into 
chromosomal alleles. However, in comparison with 
homologous extrachromosomal recombination, which occurs at 
a frequency approaching 100%, homologous plasmid-chromosome
recombination was originally reported to only
be detected at frequencies between 10⁻⁶ and 10⁻³.
Nonhomologous plasmid-chromosome interactions are more
frequent, occurring at levels 10⁵-fold to 10²-fold greater
than comparable homologous insertion.

To overcome this low proportion of targeted 
recombination in murine ES cells, various strategies have 
been developed to detect or select rare homologous 
recombinants. One approach for detecting homologous 
alteration events uses the polymerase chain reaction (PCR) 
to screen pools of transformant cells for homologous 
insertion, followed by screening of individual clones. 
Alternatively, a positive genetic selection approach has 
been developed in which a marker gene is constructed which 
will only be active if homologous insertion occurs, 
allowing these recombinants to be selected directly. One 
of the most powerful approaches developed for selecting 
homologous recombinants is the positive-negative selection 
(PNS) method developed for genes for which no direct 
selection of the alteration exists. The PNS method is 
more efficient for targeting genes which are not expressed 
at high levels because the marker gene has its own 
promoter. Non-homologous recombinants are selected
against by using the Herpes Simplex virus thymidine kinase
(HSV-TK) gene and selecting against its nonhomologous
insertion with effective herpes drugs such as gancyclovir
(GANC) or 1-(2-deoxy-2-fluoro-β-D-arabinofuranosyl)-5-
iodouracil (FIAU). By this counter selection, the number
of homologous recombinants in the surviving transformants
can be increased.
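
The practical consequence of these frequencies for a screening effort can be illustrated with a short calculation. The following Python sketch estimates the fraction of surviving clones that are homologous recombinants, given the fold excess of nonhomologous insertions recited above (10² to 10⁵), and the number of clones that would need to be screened to find at least one targeted clone with a chosen probability. The independence of clones, the 95% confidence level, and the enrichment factor attributed to negative selection are assumptions introduced for illustration and are not values from the present disclosure.

```python
import math


def targeted_fraction(nonhomologous_fold, enrichment=1.0):
    """Fraction of surviving clones that are homologous recombinants.

    nonhomologous_fold -- random insertions per targeted insertion
                          (the text gives 1e2 to 1e5)
    enrichment         -- ASSUMED fold reduction of random insertions
                          achieved by negative selection (1.0 = none)
    """
    return 1.0 / (1.0 + nonhomologous_fold / enrichment)


def clones_to_screen(fraction, confidence=0.95):
    """Smallest n with P(at least one targeted clone in n) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


# Most favorable ratio in the text (1e2), without and with an ASSUMED
# ten-fold enrichment by negative selection.
f_plain = targeted_fraction(1e2)
f_enriched = targeted_fraction(1e2, enrichment=10.0)
print(clones_to_screen(f_plain), clones_to_screen(f_enriched))
```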

As used herein, a "targeted gene" or "knock-out" is a 
DNA sequence introduced into the germline of a non-human
animal by way of human intervention, including but not
limited to, the methods described herein. The targeted
genes of the invention include DNA sequences which are 
designed to specifically alter cognate endogenous alleles. 

Methods of use for the transgenic mice of the 
invention are also provided herein. Knockout mice of the 
invention can be injected with tumor cells or treated with 
carcinogens to generate carcinomas. Such mice provide a 
biological system for assessing chemotherapy resistance as 
modulated by a MOAT gene of the invention. Accordingly, 
therapeutic agents which inhibit the action of these 
transporters and thereby prevent efflux of beneficial 
chemotherapeutic agents from tumor cells may be screened 
in studies using MOAT knock out mice. 

As described above, MOAT-encoding nucleic acids are 
also used to advantage to produce large quantities of 
substantially pure MOAT proteins, or selected portions 
thereof.

B. MOAT Proteins and Antibodies

Purified full length MOAT proteins, or fragments 
thereof, may be used to produce polyclonal or monoclonal 
antibodies which also may serve as sensitive detection 
reagents for the presence and accumulation of MOAT 
proteins (or complexes containing MOAT proteins) in 
mammalian cells. Recombinant techniques enable expression 
of fusion proteins containing part or all of MOAT 
proteins. The full length proteins or fragments of the 
proteins may be used to advantage to generate an array of 
monoclonal antibodies specific for various epitopes of 
MOAT proteins, thereby providing even greater sensitivity 
for detection of MOAT proteins in cells. 

Polyclonal or monoclonal antibodies
immunologically specific for MOAT proteins may be used in
a variety of assays designed to detect and quantitate the
proteins. Such assays include, but are not limited to:
(1) flow cytometric analysis; (2) immunochemical
localization of MOAT proteins in tumor cells; and (3)
immunoblot analysis (e.g., dot blot, Western blot) of
extracts from various cells. Additionally, as described
above, anti-MOAT antibodies can be used for purification
of MOAT proteins and any associated subunits (e.g.,
affinity column purification, immunoprecipitation).

From the foregoing discussion, it can be seen that 
MOAT-encoding nucleic acids, MOAT expressing vectors, MOAT 
proteins and anti-MOAT antibodies of the invention can be 
used to detect MOAT gene expression and alter MOAT protein 
accumulation for purposes of assessing the genetic and 
protein interactions involved in the development of drug 
resistance in tumor cells. 

C. Methods and Kits Employing the
Compositions of the Present Invention

From the foregoing discussion, it can be seen 
that MOAT-encoding nucleic acids, MOAT-expressing vectors,
MOAT proteins and anti-MOAT antibodies of the invention 
can be used to detect MOAT gene expression and alter MOAT 
protein accumulation for purposes of assessing the genetic 
and protein interactions giving rise to chemotherapy 
resistance in tumor cells. 

Exemplary approaches for detecting MOAT nucleic acid 
or polypeptides/proteins include: 

a) comparing the sequence of nucleic acid in the 
sample with the MOAT nucleic acid sequence to determine 
whether the sample from the patient contains mutations; or 

b) determining the presence, in a sample from a 
patient, of the polypeptide encoded by the MOAT gene and,
if present, determining whether the polypeptide is full
length, and/or is mutated, and/or is expressed at the 
normal level; or 

c) using DNA restriction mapping to compare the 
restriction pattern produced when a restriction enzyme 
cuts a sample of nucleic acid from the patient with the 
restriction pattern obtained from normal MOAT gene or from 
known mutations thereof; or, 

d) using a specific binding member capable of binding 
to a MOAT nucleic acid sequence (either normal sequence or 
known mutated sequence), the specific binding member
comprising nucleic acid hybridizable with the MOAT 
sequence, or substances comprising an antibody domain with 
specificity for a native or mutated MOAT nucleic acid 
sequence or the polypeptide encoded by it, the specific 
binding member being labelled so that binding of the 
specific binding member to its binding partner is 
detectable; or, 

e) using PCR involving one or more primers based on 
normal or mutated MOAT gene sequence to screen for normal 
or mutant MOAT gene in a sample from a patient. 
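
Approach (c) above can be sketched computationally as an in silico digest. The following Python example cuts a linear sequence at every EcoRI recognition site (GAATTC, cleaved after the first G) and compares the fragment lengths obtained from a reference sequence with those from a sample in which one site has been lost. The function name and both toy sequences are hypothetical and serve only to illustrate the comparison; an actual analysis would use the relevant MOAT sequence and an enzyme chosen for the mutation of interest.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting linear DNA at each site.

    cut_offset is the cut position within the site on the top strand;
    EcoRI cuts GAATTC after the first G, hence the default of 1.
    """
    cuts = []
    index = seq.find(site)
    while index != -1:
        cuts.append(index + cut_offset)
        index = seq.find(site, index + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical sequences: the sample has lost the second EcoRI site
# through a single base change (GAATTC to GAGTTC).
reference = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
sample = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

pattern_differs = sorted(digest(reference)) != sorted(digest(sample))
print(digest(reference), digest(sample), pattern_differs)
```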

A "specific binding pair" comprises a specific
binding member (sbm) and a binding partner (bp) which have
a particular specificity for each other and which in
normal conditions bind to each other in preference to
other molecules. Examples of specific binding pairs are
antigens and antibodies, ligands and receptors and
complementary nucleotide sequences. The skilled person is
aware of many other examples and they do not need to be
listed here. Further, the term "specific binding pair" is
also applicable where either or both of the specific
binding member and the binding partner comprise a part of
a large molecule. In embodiments in which the specific
binding pair are nucleic acid sequences, they will be of a
length to hybridize to each other under conditions of the
assay, preferably greater than 10 nucleotides long, more
preferably greater than 15 or 20 nucleotides long.

In most embodiments for screening for alleles giving 
rise to chemotherapy resistance, the MOAT nucleic acid in
a biological sample will initially be amplified, e.g. using
PCR, to increase the amount of the analyte as compared to 
other sequences present in the sample. This allows the 
target sequences to be detected with a high degree of 
sensitivity if they are present in the sample. This 
initial step may be avoided by using highly sensitive 
array techniques that are becoming increasingly important 
in the art.

The identification of the MOAT gene and its 
association with a particular chemotherapy resistance 
paves the way for aspects of the present invention to 
provide the use of materials and methods, such as are 
disclosed and discussed above, for establishing the 
presence or absence in a test sample of a variant form of 
the gene, in particular an allele or variant specifically 
associated with chemotherapy resistance. This may be done 
to assess the propensity of the tumor to exhibit 
chemotherapy resistance.

In still further embodiments, the present invention 
concerns immunodetection methods for binding, purifying, 
removing, quantifying or otherwise generally detecting 
biological components. The encoded proteins or peptides of 
the present invention may be employed to detect antibodies 
having reactivity therewith, or, alternatively, antibodies 
prepared in accordance with the present invention, may be 
employed to detect the encoded proteins or peptides. The 
steps of various useful immunodetection methods have been
described in the scientific literature, such as, e.g.,
Nakamura et al. (1987).

In general, the immunobinding methods include 
obtaining a sample suspected of containing a protein, 
peptide or antibody, and contacting the sample with an 
antibody or protein or peptide in accordance with the 
present invention, as the case may be, under conditions 
effective to allow the formation of immunocomplexes. 

The immunobinding methods include methods for 
detecting or quantifying the amount of a reactive 
component in a sample, which methods require the detection 
or quantitation of any immune complexes formed during the 
binding process. Here, one would obtain a sample 
suspected of containing a MOAT gene encoded protein, 
peptide or a corresponding antibody, and contact the 
sample with an antibody or encoded protein or peptide, as
the case may be, and then detect or quantify the amount of 
immune complexes formed under the specific conditions. 

In terms of antigen detection, the biological sample 
analyzed may be any sample that is suspected of containing 
the MOAT antigen, such as a tumor tissue section or 
specimen, a homogenized tissue extract, an isolated cell, 
a cell membrane preparation, separated or purified forms 
of any of the above protein-containing compositions. 

Contacting the chosen biological sample with the 
protein, peptide or antibody under conditions effective 
and for a period of time sufficient to allow the formation 
of immune complexes (primary immune complexes) is 
generally a matter of simply adding the composition to the 
sample and incubating the mixture for a period of time 
long enough for the antibodies to form immune complexes 
with, i.e., to bind to, any antigens present. After this 
time, the sample-antibody composition, such as a tissue
section, ELISA plate, dot blot or Western blot, will
generally be washed to remove any non-specifically bound
antibody species, allowing only those antibodies 
specifically bound within the primary immune complexes to 
be detected. 

In general, the detection of immunocomplex formation 
is well known in the art and may be achieved through the 
application of numerous approaches. These methods are 
generally based upon the detection of a label or marker, 
such as any radioactive, fluorescent, biological or 
enzymatic tags or labels of standard use in the art. U.S. 
Patents concerning the use of such labels include U.S. 
Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 
4,277,437; 4,275,149 and 4,366,241, each incorporated 
herein by reference. Of course, one may find additional 
advantages through the use of a secondary binding ligand 
such as a second antibody or a biotin/avidin ligand 
binding arrangement, as is known in the art. 

In one broad aspect, the present invention 
encompasses kits for use in detecting expression of MOAT 
encoding nucleic acids in biological samples, including 
biopsy samples. Such a kit may comprise one or more pairs 
of primers for amplifying nucleic acids corresponding to 
the MOAT gene. The kit may further comprise samples of 
total mRNA derived from tissues expressing at least one or 
a subset of the MOAT genes of the invention, to be used as 
controls. The kit may also comprise buffers, nucleotide 
bases, and other compositions to be used in hybridization 
and/or amplification reactions. Each solution or 
composition may be contained in a vial or bottle and all 
vials held in close confinement in a box for commercial 
sale. In a further embodiment, the invention encompasses 
a kit for use in detecting MOAT proteins in chemotherapy
resistant cancer cells comprising antibodies specific for
MOAT proteins encoded by the MOAT nucleic acids of the
present invention.

Another aspect of the present invention comprises
screening methods employing host cells expressing one or
more MOAT genes of the invention. An advantage of having
discovered the complete coding sequences of MOAT-B through
MOAT-E is that cell lines that overexpress MOAT-B, MOAT-C,
MOAT-D or MOAT-E can be generated using standard
transfection protocols. Cells that overexpress the complete
cDNA will also harbor the complete proteins, a feature that
is essential for biological activity of the proteins. The
overexpressing cell lines will be useful in several ways:
1) The drug sensitivity of overexpressing cell lines can be
tested with a variety of known anticancer agents in order
to determine the spectrum of anticancer agents to which
the transporter confers resistance; 2) The drug sensitivity
of overexpressing cell lines can be used to determine
whether newly discovered anticancer agents are transported
out of the cell by one of the discovered transporters;
3) Overexpressing cell lines can be used to identify
potential inhibitors that reduce the activity of the
transporters. Such inhibitors are of great clinical
interest in that they may enhance the activity of known
anticancer agents, thereby increasing their effectiveness.
Reduced activity will be detected by restoration of
anticancer drug sensitivity, or by reduction of
transporter-mediated cellular efflux of anticancer agents.
In vitro biochemical studies designed to identify reduced
transporter activity in the presence of potential
inhibitors can also be performed using membranes prepared
from overexpressing cell lines; and 4) Overexpressing cell
lines can also be used to determine whether pharmaceutical
agents that are not anticancer agents are transported out
of the cell by the transporters.
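
The readout described in items 1) to 3) can be reduced to two ratios. The following is only a minimal sketch under stated assumptions: the specification does not prescribe a metric, so the use of IC50 values, the function names `fold_resistance` and `reversal_index`, and every number below are hypothetical and serve purely as illustration.

```python
def fold_resistance(ic50_overexpressing, ic50_parental):
    """Ratio of IC50 values; a value above 1 means the overexpressing
    line tolerates more drug than the parental line."""
    if ic50_overexpressing <= 0 or ic50_parental <= 0:
        raise ValueError("IC50 values must be positive")
    return ic50_overexpressing / ic50_parental


def reversal_index(fold_without_inhibitor, fold_with_inhibitor):
    """Factor by which a candidate inhibitor lowers the fold resistance;
    a value above 1 indicates restored drug sensitivity."""
    if fold_without_inhibitor <= 0 or fold_with_inhibitor <= 0:
        raise ValueError("fold resistance values must be positive")
    return fold_without_inhibitor / fold_with_inhibitor


# Placeholder numbers in arbitrary units (hypothetical, not measurements).
baseline = fold_resistance(ic50_overexpressing=50.0, ic50_parental=5.0)
treated = fold_resistance(ic50_overexpressing=10.0, ic50_parental=5.0)
print(baseline, treated, reversal_index(baseline, treated))
```

A candidate compound would be of interest when the reversal index is clearly above 1 while the parental line is unaffected by the compound alone; that second control is not modeled here.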



The following protocols are provided to facilitate 
the practice of the present invention. 

Isolation of MOAT-B cDNA

Forward {CT(A/G/T) GT(A/G/T) GC(A/G/T) GT(A/G/T)
GT(A/G/T) GG(A/G/C/T)} (SEQ ID NO: 9) and reverse {(G/A)CT
(A/G/C/T)A(A/G/C) (A/G/C/T)GC (A/G/C/T)(G/C)(T/A)
(A/G/C/T)A(A/G) (A/G/C/T)GG (A/G/C/T)TC (A/G)TC} (SEQ ID
NO: 16) degenerate oligonucleotide primers were designed
based upon the first nucleotide binding folds of human
MRP, CFTR, and MDR1. Bacteriophage DNA isolated from a
C200 cDNA library prepared in the λpCEV27 phagemid
vector (17) was used as template in PCR reactions
containing 250 ng cDNA, 5 µM primers, 50 mM KCl, 10 mM
Tris-HCl, pH 8.3, 3 mM MgCl2, 0.05% gelatin, 0.2 mM dNTP
and Taq polymerase (Perkin Elmer Cetus). Five cycles of
PCR were performed as follows: 94°C for 1 minute, 40°C
for 2 minutes, 72°C for 3 minutes. Twenty-five cycles
were then performed as follows: 94°C for 1 minute, 55°C
for 1 minute, and 72°C for 1 minute. The resulting
reaction products were used as template in a second
round of PCR, as described above, with nested forward
{CGGGATCC AG(A/G) GA(A/G) AA(C/T) AT(A/C/T) CT(A/G/C/T)
TTT GG(A/G/C/T)} (SEQ ID NO: 17) and reverse {CGGAATTC
(A/G/T/C)TC (A/G)TC (A/C/T)AG (A/G/C/T)AG (A/G)TA
(A/T/G)AT (A/G)TC} (SEQ ID NO: 18) degenerate
oligonucleotide primers. PCR reaction products were
isolated from an agarose gel and subcloned into the
BamHI and EcoRI sites of pBluescript (Stratagene).
Nucleotide sequence analysis
was performed on plasmid DNA prepared from ampicillin
resistant transformants. Additional cDNA clones were
isolated from C200 (ovary) and B5 (breast) cDNA libraries
by plaque hybridization using the PCR product as the
initial radiolabeled probe.
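
The parenthesized alternatives in the primers above, such as (A/G/T), each denote a mixed position, so every primer is a pool of related sequences. The sketch below parses that notation and counts the pool size. The helper names `parse_degenerate`, `degeneracy` and `expand` are illustrative assumptions, and the forward primer is entered as printed for SEQ ID NO: 9 with extraction spacing removed.

```python
import re
from itertools import product

# One token is either a parenthesized set such as (A/G/T) or a single base.
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])*)\)|([ACGT])")
WHOLE = re.compile(r"(?:\([ACGT](?:/[ACGT])*\)|[ACGT])+")


def parse_degenerate(primer):
    """Return one tuple of allowed bases per primer position."""
    compact = re.sub(r"\s+", "", primer).upper()
    if not WHOLE.fullmatch(compact):
        raise ValueError("unexpected character in primer: " + primer)
    return [tuple(group.split("/")) if group else (base,)
            for group, base in TOKEN.findall(compact)]


def degeneracy(primer):
    """Number of distinct sequences in the degenerate pool."""
    count = 1
    for choices in parse_degenerate(primer):
        count *= len(set(choices))
    return count


def expand(primer):
    """Enumerate every concrete sequence (only sensible for small pools)."""
    return ["".join(bases) for bases in product(*parse_degenerate(primer))]


# Forward primer as printed (SEQ ID NO: 9), braces omitted.
FORWARD = "CT(A/G/T) GT(A/G/T) GC(A/G/T) GT(A/G/T) GT(A/G/T) GG(A/G/C/T)"
print(len(parse_degenerate(FORWARD)), degeneracy(FORWARD))
```

For the forward primer this gives 18 positions and a pool of 3 x 3 x 3 x 3 x 3 x 4 = 972 sequences, which is one reason the first five cycles use a low (40°C) annealing temperature.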

RNA Blot Analysis 

Blots containing poly(A)+ RNA isolated from human
tissues (Clontech) were prehybridized at 45°C for 8 hours
in 50% formamide, 4X SSC, 4X Denhardt's solution, 0.04 M
sodium phosphate monobasic, pH 6.5, 0.8% (w/v) glycine, and
0.1 mg/ml sheared denatured salmon sperm DNA.

Hybridization was performed at 45°C with 32P-labeled MOAT-B
or GAPDH probes in a solution containing 50% formamide, 3X
SSC, 0.04 M sodium phosphate pH 6.5, 10% dextran sulfate, and
0.1 mg/ml sheared denatured salmon sperm DNA. Blots were
washed 2 times for 15 min at 65°C in 2X SSC, 5 mM Tris-HCl
pH 7.4, 0.5% SDS, 2.5 mM EDTA, 0.1% sodium pyrophosphate pH
8.0, and subsequently washed 2 times for 15 min in 0.1X
SSC. Blots were then subjected to autoradiography.
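
The SSC strengths in this protocol translate directly into salt concentrations, which is what sets wash stringency. The sketch below assumes the standard laboratory definition of 1X SSC (150 mM NaCl, 15 mM sodium citrate); that definition is general background rather than something stated in this specification, and the function name `ssc_composition` is illustrative.

```python
# Standard 1X SSC, in millimolar (assumed definition, see lead-in).
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_composition(fold):
    """Return the millimolar composition of an SSC solution of given strength."""
    if fold <= 0:
        raise ValueError("SSC strength must be positive")
    return {solute: conc * fold for solute, conc in SSC_1X_MM.items()}


# SSC strengths named in the protocol above.
steps = [
    ("prehybridization", 4),
    ("hybridization", 3),
    ("first washes", 2),
    ("final washes", 0.1),
]
for label, fold in steps:
    print(label, ssc_composition(fold))
```

The drop from 2X to 0.1X SSC lowers the sodium concentration twenty-fold, so the final washes are the high-stringency step.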

Chromosomal localization 

Preparation of metaphase spreads from
phytohemagglutinin-stimulated lymphocytes of a healthy
female donor, and fluorescence in situ hybridization and
detection of immunofluorescence were carried out as
previously described (18). A 2.2-kb cDNA clone of MOAT-B
inserted in pBluescript was biotinylated by nick
translation in a reaction containing 1 µg DNA, 20 µM each
of dATP, dCTP and dGTP, 1 µM dTTP, 25 mM Tris-HCl, pH 7.5,
5 mM MgCl2, 10 mM β-mercaptoethanol, 10 µM biotin-16-dUTP
(Boehringer Mannheim), 2 units DNA polymerase I/DNase I
(GIBCO, BRL) and water to a total volume of 50 µl. The
probe was denatured and hybridized to metaphase spreads
overnight at 37°C. Hybridization sites were detected with
fluorescein-labeled avidin (Oncor) and amplified by
addition of anti-avidin antibody (Oncor) and a second
layer of fluorescein-labeled avidin. The chromosome
preparations were counterstained with DAPI and observed
with a Zeiss Axiophot epifluorescence microscope equipped
with a cooled charge coupled device camera (Photometrics,
Tucson AZ) operated by a Macintosh computer work station.
Digitized images of DAPI staining and fluorescein signals
were captured, pseudo-colored and merged using Oncor Image
version 1.6 software.

Isolation of MOAT-C and MOAT-D cDNA

MOAT-C and MOAT-D cDNA clones were isolated by plaque
hybridization from bacteriophage cDNA libraries using the
I.M.A.G.E. clones (ATCC) as the initial probes.

RNA blot analysis 

Blots containing poly(A)+ RNA isolated from human
tissues were purchased from Clontech, and
hybridized with radiolabeled MOAT-C, MOAT-D or actin
probes according to the manufacturer's directions.

Chromosomal localization 

Preparation of metaphase spreads from 
phytohemagglutinin-stimulated lymphocytes of a healthy 
female donor, and fluorescence in situ hybridization and 
detection of immunofluorescence were carried out as 
previously described (18). A MOAT-C probe inserted in
pBluescript, or MOAT-D probe inserted in pBluescript, was
biotinylated by nick translation in a reaction containing
1 µg DNA, 20 µM each of dATP, dCTP and dGTP, 1 µM dTTP, 25
mM Tris-HCl, pH 7.5, 5 mM MgCl2, 10 mM β-mercaptoethanol,
10 µM biotin-16-dUTP (Boehringer Mannheim), 2 units DNA
polymerase I/DNase I (GIBCO, BRL) and water to a total
volume of 50 µl. The probe was denatured and hybridized
to metaphase spreads overnight at 37°C. Hybridization
sites were detected with fluorescein-labeled avidin
(Oncor) and amplified by addition of anti-avidin antibody
(Oncor) and a second layer of fluorescein-labeled avidin.
The chromosome preparations were counterstained with DAPI
and observed with a Zeiss Axiophot epifluorescence
microscope equipped with a cooled charge coupled device
camera (Photometrics, Tucson AZ) operated by a Macintosh
computer work station. Digitized images of DAPI staining
and fluorescein signals were captured, pseudo-colored and
merged using Oncor Image version 1.6 software.

The following examples are provided to illustrate 
various embodiments of the invention. They are not 
intended to limit the invention in any way. 

EXAMPLE I 
Isolation of MOAT-B cDNA.

A degenerate PCR approach was used to isolate
MRP-related transporters. Degenerate oligonucleotide
primers were prepared based upon the N-terminal nucleotide
binding folds of MRP and other eukaryotic transporters,
and used in conjunction with DNA prepared from an ovarian
cancer cell line bacteriophage library. Nucleotide
sequence analysis of one of the resulting PCR products
indicated that it encoded a segment of a novel nucleotide
binding fold that was most closely related to MRP and
cMOAT. Overlapping cDNA clones were isolated from ovarian
and breast bacteriophage libraries by plaque hybridization
using the PCR product as the initial probe. A total of
5.9 kb of cDNA was isolated. Nucleotide sequence analysis
revealed two classes of cDNA clones that were about
equally represented among isolates from each of the two
bacteriophage libraries. The first class contained an
open reading frame of 3975 bp that was bordered by in
frame stop codons located at positions -76 and -42
(relative to the putative initiation codon) and 3976, and
encoded a predicted protein of 1325 amino acids, which is
designated MOAT-B. The open reading frame was followed by
approximately 2 kb of 3' untranslated sequence. The most
upstream ATG in the open reading frame was located in the
sequence context -4 CAAGATGC +4. The A at position -3
relative to the putative translation initiation codon was
in agreement with the major feature of the Kozak consensus
sequence, but the C at position +4 was divergent from the
more usual G. The second class of cDNA clones was identical
to the first with the exception of a single nucleotide.
These clones harbored an additional T following nucleotide
3872 of the first class of clones, close to the C-terminus
of the predicted protein. This additional nucleotide
resulted in a frame shift such that the predicted protein
of the second class of cDNA clones was 22 residues shorter
than that of the first class of cDNA clones, and in which
the C-terminal 34 residues of the latter reading frame
were replaced by 12 distinct residues. See brief
description of Figure 1.
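
The lengths quoted in this example can be cross-checked with simple codon arithmetic. The sketch below uses only numbers stated in the text (a 3975 bp open reading frame, an extra T after nucleotide 3872, 34 residues replaced by 12); the function names are illustrative assumptions.

```python
def residues_in_orf(orf_bp):
    """Number of codons in an open reading frame that excludes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3


def codon_of_nucleotide(position):
    """1-based codon index holding a 1-based nucleotide position,
    counting from the A of the initiation codon."""
    if position < 1:
        raise ValueError("position must be 1 or greater")
    return (position - 1) // 3 + 1


first_class = residues_in_orf(3975)      # protein encoded by the first class
shared = first_class - 34                # residues common to both classes
second_class = shared + 12               # frame-shifted C-terminus
print(first_class, shared, second_class, first_class - second_class)
print(codon_of_nucleotide(3872))
```

The arithmetic gives 1325 residues for the first class, 1303 for the second (22 fewer), and places nucleotide 3872 in codon 1291, which matches the 1291 residues shared by the two classes.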

Analysis of the MOAT-B Predicted Structure. 

Comparison of the MOAT-B predicted protein with
complete coding sequences in protein data bases using the
BLAST program indicated that it shared significant
similarity with several eukaryotic ABC transporters. See
Table I.

Table I. Comparison of peptide domains of MOAT-B with
those of other eukaryotic ABC transporters

MOAT-B domain  TM1       NBF1       Linker     TM2        NBF2         C-terminus   Overall
(peptide)      (88-376)  (428-576)  region     (706-992)  (105S-1216)  (1217-1325)  identity
                                    (577-705)

                                    percent identity

MRP human      28.6      55.6       27.9       33.3       61.6         51.6         39.2
YCF1 yeast     27        56         27.9       34         57.2         48.5         38.9
cMOAT human    33.2      53.3       32.8       31.4       55.3         44.9         38
CFTR human     30.5      48         27.9       37.7       44           21           36.3
SUR rat        28.1      41.3       28.2       30         52.8         42.8         32.9
MDR1 human     17.6      39.2       21.1       17.3       32.2         40.3         23.3

The indicated domains are: TM1, segment containing the
transmembrane spanning domain N-terminal to NBF1; NBF1 and NBF2,
nucleotide binding folds 1 and 2; Linker region, segment located
between NBF1 and TM2; TM2, segment containing the transmembrane spanning
domain located between the two NBFs; C-terminus, segment between NBF2
and the C-terminus of the proteins. Sequence alignments were generated
using the PILEUP program of the GCG package. Percent amino acid
identity with MOAT-B domains is shown.
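
Percent identity values such as those in Table I are computed from a pairwise alignment. The sketch below uses one common convention, identical residues divided by aligned columns in which neither sequence has a gap; the exact denominator used by the GCG programs may differ, and the two short aligned fragments are made-up toy strings, not MOAT-B sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # columns with a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical): 8 gap-free columns, 6 identical.
print(percent_identity("GSGKSSL-L", "GAGKTSLAL"))
```

Because the per-domain values in Table I use different domain boundaries as denominators, the overall identity is not a simple average of the domain columns.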



Typical features of eukaryotic ABC transporters were
present in the predicted MOAT-B protein. See Figure 1.
Overall the protein was composed of a tandem repeat of a
nucleotide binding fold appended C-terminal to a
hydrophobic domain that contained several potential
transmembrane spanning helices. Conserved Walker A and B
ATP binding sites were present in each of the nucleotide
binding folds. See Figure 2A. In addition, a conserved C
motif, the signature sequence of ABC transporters, was
present in each nucleotide binding fold. Analysis of
potential transmembrane motifs using the TMAP program (19)
and an input sequence alignment of MOAT-B and MOAT-C, a
transporter highly related to MOAT-B, predicted 12
transmembrane helices with 6 transmembrane segments in
each of the two hydrophobic domains. This 6+6
configuration of predicted transmembrane helices is in
agreement with topological models proposed for MRP and
other ABC transporters (20, 21), and is shown in Figure 1.
However, alternative predictions of transmembrane segments
were obtained using different program parameters or input
sequence alignments. For example, when the TMAP program
was used with an input sequence alignment consisting of
human MRP, rat cMOAT, rat sulfonylurea receptor (SUR),
human cystic fibrosis transmembrane conductance regulator
(CFTR) and human P-glycoprotein, a 6+5 configuration was
predicted. The only substantial difference between the
latter prediction and the structure shown in Figure 1 is
that transmembrane segments 9 (829-853) and 10 (855-878)
were replaced by a single predicted transmembrane segment
spanning amino acids 847-875.

Among ABC transporters, the degree of similarity of
the nucleotide binding folds is considered to be the best
indicator of functional conservation. Comparison of the
nucleotide binding folds of MOAT-B with other eukaryotic
ABC transporters indicated that it was most closely
related to MRP, the yeast cadmium resistance protein
(YCF1) and cMOAT (Table I), three transporters that have
organic anions as substrates. The MOAT-B NBF1 was 55.6,
56.0 and 53.3 percent identical, and the MOAT-B NBF2 was
61.6, 57.2 and 55.3 percent identical to the first and
second nucleotide binding folds of human MRP, YCF1 and
human cMOAT, respectively. Aside from the latter
transporters, the MOAT-B nucleotide binding folds were
most closely related to those of CFTR and SUR. The MOAT-B
nucleotide binding folds shared significantly less
similarity with those of MDR1. Alignment of the MOAT-B
nucleotide binding folds with those of other eukaryotic
transporters is shown in Figure 2A. Analysis of the
overall amino acid identity of MOAT-B with other ABC
transporters also indicated that it was most closely
related to MRP, YCF1 and cMOAT (Table I). Overall MOAT-B
was 39.2, 38.9 and 38 percent identical to these
transporters, respectively. Figure 2B shows a comparison
of the hydropathy profiles of MOAT-B with those of other
eukaryotic transporters. This comparison reveals that
MOAT-B (1325 amino acids) is approximately 200 amino acids
smaller than MRP (1531 residues), cMOAT (1545 residues)
and YCF1 (1515 residues), and that this size difference is
largely accounted for by the absence in MOAT-B of an amino
terminal hydrophobic extension that is present in MRP,
cMOAT and YCF1 (22). This N-terminal hydrophobic segment
is predicted to harbor several transmembrane spanning
segments, and is also present in SUR.

Expression Pattern of MOAT-B in Human Tissues.

To gain insight into the possible function of MOAT-B,
its expression pattern in a variety of human tissues was
examined by RNA blot analysis. As shown in Figure 3, a
MOAT-B transcript of approximately 6 kb was readily
detected. The isolation of 5.9 kb of MOAT-B cDNA was
consistent with this size. MOAT-B expression was detected
in each of the 16 tissues analyzed. Transcript levels were
highest in prostate and lowest in liver and peripheral
blood leukocytes, for which prolonged exposure of film
was required to detect expression. Intermediate levels of
expression were observed in other tissues.

Chromosomal Localization of the MOAT-B Gene.

The MOAT-B chromosomal localization was determined by
fluorescence in situ hybridization. As shown in Figure 4,
hybridization of the MOAT-B probe to metaphase spreads
revealed specific labeling at human chromosome band 13q32.
Fluorescent signals were detected on chromosome 13 in each
of 19 metaphase spreads scored. Of 135 signals observed,
62 (46%) were on 13q. Among these signals, 61 localized
at 13q32, near the boundary between 13q31 and 13q32.
Paired (on sister chromatids) signals were only seen at
band 13q32. In several metaphases, signals on a single
chromatid were observed at chromosome bands 6p21 or 4q21,
suggesting hybridization to distantly related sequences.
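
The percentage quoted for the hybridization signals follows directly from the counts given above. The sketch below recomputes it; the function name `signal_percentage` is an illustrative assumption, and only counts stated in the text are used.

```python
def signal_percentage(on_target, total):
    """Percent of scored hybridization signals falling on the target region."""
    if total <= 0 or not 0 <= on_target <= total:
        raise ValueError("counts must satisfy 0 <= on_target <= total, total > 0")
    return 100.0 * on_target / total


# MOAT-B probe: 62 of 135 signals on 13q, 61 of those 62 at band 13q32.
print(round(signal_percentage(62, 135)))
print(round(signal_percentage(61, 62), 1))
```

The first value reproduces the 46% stated in the text; the second shows that nearly all of the 13q signals sit at band 13q32, which together with the paired sister-chromatid signals supports the assignment.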



EXAMPLE II 
Isolation of MOAT-C and MOAT-D cDNA. 

Isolation of the MOAT-B transporter as described
above suggested the possibility that there were other
MRP/cMOAT-related transporters. A BLAST search (36) of
the nonredundant expressed sequence tag data base using
MRP and related yeast transporters revealed two clones
with significant similarity to MRP and cMOAT. The first
of these sequences (I.M.A.G.E. consortium clone 113196)
was 1.2 kb in length, 800 bp of which encoded an
MRP-related peptide. A segment of this clone was used as
a probe to screen ovarian and hematopoietic bacteriophage
libraries. Analysis of these cDNA clones indicated that
they contained approximately 2 kb of additional coding
sequence not present in clone 113196. An additional 1655
bp of 5' sequence was obtained by several rounds of RACE
using the bacteriophage DNA prepared from the ovarian cDNA
library as template. The continuity of the sequences
obtained by RACE with the cDNA clones isolated from
bacteriophage libraries was confirmed by nucleotide
sequence analysis of a 2 kb product obtained by RT/PCR
using an upstream oligonucleotide primer located at the 5'
end of the RACE sequence and a downstream primer located
at the 5' end of the cDNA obtained by plaque
hybridization. A total of approximately 5.9 kb of cDNA
sequence was isolated. Nucleotide sequence analysis
revealed an open reading frame of 4311 bp that was
preceded by an in frame stop codon located at position
-93 (relative to the putative initiation codon), and
encoded a predicted protein of 1437 amino acids, which is
designated MOAT-C herein. The open reading frame was
followed by approximately 1.4 kb of 3' untranslated
sequence in which a polyadenylation sequence (AAUAAA) was
located 20 bp upstream of the poly(A) tail. The most
upstream ATG in the open reading frame was located in the
sequence context -4 GAAGATGA +4. The A at position -3
relative to the putative translation initiation codon was
in agreement with the major feature of the Kozak consensus
sequence, but the A at position +4 was divergent from the
more usual G (37). The second sequence identified in our
data base search (I.M.A.G.E. consortium clone 208097) was
1.2 kb in length, of which 588 bp encoded an MRP-related
peptide. A segment of this clone was used as a probe to
screen liver and monocyte bacteriophage cDNA libraries, and
5' cDNA segments of the isolated cDNA clones were used in a
subsequent round of screening. Together approximately 5.2
kb of cDNA sequence was isolated. Nucleotide sequence
analysis revealed an open reading frame of 4570 bp, which
is designated MOAT-D herein. The open reading frame was
followed by approximately 0.6 kb of 3' untranslated
sequence in which a polyadenylation sequence (AAUAAA) was
located 12 bp upstream of the poly(A) tail. An upstream
in frame stop codon was not present in the MOAT-D cDNA
clones, and attempts to obtain additional upstream
sequences by RACE using as template cDNA prepared from
sources in which MOAT-D is abundant were not successful.
The most upstream ATG in the open reading frame
(nucleotide positions 5-7), located in the sequence context
-4 ATGGATGG +4, was therefore designated as the translational
initiation site. The G at position +4 was in good
agreement with the Kozak consensus sequence, but the T at
-3 was divergent from the more usual A (37). Although an
upstream in frame stop codon was not identified in the
MOAT-D cDNA clones, the size of the encoded protein was
within one amino acid of the size of the transporter with
which it shares the highest degree of identity (MRP),
suggesting that the complete MOAT-D open reading frame was
present in the isolated cDNA clones.
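
The initiation-codon contexts reported for MOAT-B, MOAT-C and MOAT-D can be checked against the two Kozak positions the text discusses. The sketch below assumes each context is written as eight nucleotides spanning -4 to +4 (four upstream bases, the ATG, and the base at +4); the function name `kozak_features` is an illustrative assumption.

```python
def kozak_features(context):
    """Report the -3 and +4 bases of an 8-nt context spanning -4 to +4."""
    context = context.upper()
    if len(context) != 8 or context[4:7] != "ATG":
        raise ValueError("expected 4 upstream bases, ATG, then the +4 base")
    minus3 = context[1]   # index 0 is -4, index 1 is -3
    plus4 = context[7]    # the base immediately after the ATG
    return {
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in "AG",
        "g_at_plus4": plus4 == "G",
    }


# Contexts as given in Examples I and II.
for name, context in [("MOAT-B", "CAAGATGC"),
                      ("MOAT-C", "GAAGATGA"),
                      ("MOAT-D", "ATGGATGG")]:
    print(name, kozak_features(context))
```

MOAT-B and MOAT-C satisfy the -3 purine but not the +4 G, while MOAT-D shows the opposite pattern, in line with the description above.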

Analysis of the MOAT-C and MOAT-D Predicted Proteins.

Comparison of the MOAT-C and MOAT-D predicted
proteins with complete coding sequences in protein data
bases using the BLAST program indicated that they shared
significant similarity with several eukaryotic ABC
transporters. Typical features of eukaryotic ABC
transporters were present in the predicted proteins. See
Figure 5. Overall the proteins were composed of
hydrophobic domains containing potential transmembrane
spanning helices and two nucleotide binding folds.
Conserved Walker A and B ATP binding sites, as well as a
conserved C motif, the signature sequence of ABC
transporters, were present in the nucleotide binding folds.
Computer assisted analysis of potential transmembrane
helices of MOAT-C using the TMAP program (19) predicted 12
transmembrane helices with 6 transmembrane spanning
helices in each of two membrane spanning domains. This
6+6 (TM1-TM6 and TM7-TM12) configuration of predicted
transmembrane helices is in agreement with topological
models proposed for several other ABC transporters (20,
21), and is shown in Figure 5. However, alternative
predictions of transmembrane segments were obtained using
different program parameters or input sequence alignments.
Comparison of the hydropathy profiles of MOAT-C with other
MRP/cMOAT-related transporters (Fig. 6B) indicates that
its structure is similar to that of MOAT-B, which also has
two membrane spanning domains.

In contrast to MOAT-C, hydrophobicity analysis of
MOAT-D indicated that it has three membrane spanning
domains. Similar to MRP, cMOAT and the yeast cadmium
resistance factor 1 (YCF1), MOAT-D has an additional
N-terminal hydrophobic domain that is not present in
MOAT-B or MOAT-C (Figs. 5 and 6). A 5+6+6 configuration
of transmembrane spanning helices has been proposed for
MRP (38), in which the N-terminal extension harbors 5
transmembrane spanning helices, and 6 transmembrane
helices are present in each of the second and third
membrane spanning domains. An alignment of the MOAT-D
predicted protein with MRP using the GAP program indicated
that the proposed MRP transmembrane spanning helices were
conserved in MOAT-D. This 5+6+6 model for MOAT-D is shown
in Fig. 5. Another configuration of transmembrane spanning
helices (5+6+4) was predicted using computer assisted
analysis. MRP has been reported to have two N-linked
glycosylation sites in its N-terminus (Asn-19 and Asn-23)
and another site located between the first and second
transmembrane spanning helix of its third membrane spanning
domain (Asn-1006). The alignment of MOAT-D with MRP
indicated that an N-terminal (Asn-21) and a distal
(Asn-1008/1009) N-glycosylation site were conserved in
analogous positions in MOAT-D. Only the distal
N-glycosylation site of MRP is conserved in MOAT-C (Asn890)
(Fig. 5) and MOAT-B (Asn746/754).

Among ABC transporters, the degree of similarity of
the nucleotide binding folds is considered to be the best
indicator of functional conservation. Comparison of the
nucleotide binding folds of MOAT-C and MOAT-D with other
eukaryotic ABC transporters indicated that they were most
closely related to those of human MRP, human cMOAT and
yeast YCF1, three transporters that have organic anions as
substrates. As shown in Table II, among the human
transporters, the MOAT-C NBF1 was about equally related to
MOAT-D, MRP and cMOAT (55-61% identity), and less similar
to MOAT-B (49% identity).

Table II. Amino acid identity: nucleotide binding folds 1
and 2 of MRP/cMOAT sub-family members.

The MOAT-C NBF2 shared about equal amino acid identity
with the five other transporters in this group (59-61%
identity). Overall, the MOAT-C protein was about equally
related to the other five transporters in this group, with
33.1-36.5% identity. Aside from these
transporters, MOAT-C is most closely related to CFTR, with
which its NBFs shared 44%/42% identity, and SUR, with
which its NBFs shared 49%/51% identity.

The MOAT-D NBFs were clearly most closely related to
those of MRP and cMOAT, with which they shared considerable
amino acid identity (67.3-73.8%). See Table III. Of the
latter two transporters, the MOAT-D NBFs were slightly more
related to those of MRP. In contrast, the MOAT-D NBFs
shared only 55.3-58.9% identity with those of MOAT-C and
MOAT-B. Overall, MOAT-D was again most closely related to
MRP (57.3%) and cMOAT (46.9%), but significantly more
related to MRP. Consistent with the analysis of NBFs,
MOAT-D was much less related to MOAT-C and MOAT-B, with
which it shared only 33.1% and 35.3% identity,
respectively. Alignment of the MOAT-C and MOAT-D nucleotide
binding folds with those of other eukaryotic transporters
is shown in Fig. 6.



Table III. Overall amino acid identity among MRP/cMOAT
sub-family members

Expression Pattern of MOAT-C and MOAT-D in Human Tissues.

To gain insight into the possible functions of MOAT-C
and MOAT-D, their expression patterns in a variety of human
tissues were examined by RNA blot analysis. As
shown in Fig. 7 (upper panels), a MOAT-C transcript of
approximately 6.6 kb was readily detected in several
tissues. MOAT-C transcript levels were highest in
skeletal muscle, with intermediate levels in kidney,
testes, heart and brain. Low levels were detected in most
other tissues, including spleen, thymus, prostate, ovary,
and placenta. Prolonged exposures were required for
detection in lung and liver. MOAT-D was expressed as an
approximately 6 kb transcript (middle panels). Compared
to MOAT-C, the MOAT-D expression pattern was more
restricted. MOAT-D was highly expressed in colon and
pancreas, with lower levels in liver and kidney. Low
levels were detected in small intestine, placenta and
prostate. Prolonged exposures were required to detect
MOAT-D in testes, thymus, spleen and lung.

Chromosomal localization of the MOAT-C and MOAT-D genes. 

The MOAT-C and MOAT-D chromosomal localizations 
were determined by fluorescence in situ hybridization. As 
shown in Figure 8, hybridization of the MOAT-C probe to 
metaphase spreads revealed specific labeling at human 
chromosome band 3q27. Fluorescent signals were detected 
on chromosome 3q in each of 22 metaphase spreads scored. 
Of 75 signals observed, 43 (57%) were on 3q. Paired (on 
sister chromatids) signals were only seen at band 3q27. 
Hybridization of the MOAT-D probe revealed specific 
labeling at human chromosome band 17q21.3. Fluorescent 
signals were detected on chromosome 17 in each of 21 
metaphase spreads scored. Of 83 signals observed, 34 
(41%) were on 17q21.3. Paired (on sister chromatids)
signals were only seen at band 17q21.3.

EXAMPLE III
Isolation of MOAT-E cDNA.

Analysis of ara, a reported cDNA sequence that
encodes a 453 amino acid transporter, revealed that it is
a non-physiological sequence representing a combination
of 5' MRP sequences fused to an MRP/cMOAT-related
transporter. The MRP sequences extend to codon 8 of the
reported predicted protein.

To isolate the complete physiological cDNA, an RT/PCR
approach was employed in which primers were designed
based upon a reported genomic sequence that encodes exons
identical to the reported ara sequence. The MOAT-E cDNA
was isolated in three segments. The first segment,
spanning residues 1-616, was isolated by PCR using 5'
primer ATGGCCGCGCCTGCTGAGC (SEQ ID NO: 10) and 3' primer
GTCTACGACACCAGGGTCAA (SEQ ID NO: 11). The second
segment, spanning residues 1815-3187, was isolated by PCR
using 5' primer CTGCCTGGAAGAAGTTGACC (SEQ ID NO: 12) and 3'
primer CTGGAATGTCCACGTCAACC (SEQ ID NO: 13). The third
segment, spanning residues 3158-1503, was isolated by PCR
using 5' primer GGAGACAGACACGGTTGACG (SEQ ID NO: 14) and
3' primer GCAGACCAGGCCTGACTCC (SEQ ID NO: 15). The
primers were designed based upon the nucleotide sequence
of human genomic BAC clone CIT987SD-962B4. The template
for these reactions was random-primed human kidney cDNA
prepared from total RNA. Using this approach the
physiological cDNA was isolated, which is designated
MOAT-E herein and set forth as Sequence I.D. No. 7.
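
Because the cDNA was assembled from three PCR segments, adjacent segments need to share sequence at their junctions. One way to look for this, sketched below, is to compare each segment's 5' primer with the reverse complement of the preceding segment's 3' primer. The helper names are illustrative assumptions, the primers are entered as printed above with extraction spacing removed, and a short shared stretch found this way is only suggestive, since it depends on the printed sequences being free of transcription errors.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def junction_overlap(next_forward, previous_reverse):
    """Longest suffix of the next segment's 5' primer that is also a prefix
    of the previous 3' primer's binding site on the coding strand."""
    site = reverse_complement(previous_reverse)
    forward = next_forward.upper()
    for length in range(min(len(forward), len(site)), 0, -1):
        if forward[-length:] == site[:length]:
            return forward[-length:]
    return ""


# Primers as printed (SEQ ID NOS: 11 to 14).
SEG1_REVERSE = "GTCTACGACACCAGGGTCAA"
SEG2_FORWARD = "CTGCCTGGAAGAAGTTGACC"
SEG2_REVERSE = "CTGGAATGTCCACGTCAACC"
SEG3_FORWARD = "GGAGACAGACACGGTTGACG"

print(junction_overlap(SEG2_FORWARD, SEG1_REVERSE))
print(junction_overlap(SEG3_FORWARD, SEG2_REVERSE))
```

With the sequences as printed, the two junctions share TTGACC and GGTTGACG respectively, which is what would be expected if each segment begins shortly before the previous one ends.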

Analysis of the MOAT-E Predicted Protein.

MOAT-E encodes a 1503 amino acid transporter. The
MOAT-E predicted amino acid sequence is designated
Sequence I.D. No. 8. See Figure 9. Also shown is the
location of potential transmembrane helices (overbars),
potential N-glycosylation site (black dot) and the two
nucleotide binding folds (NBF1 and NBF2). Walker A and B
motifs, as well as the signature C motif of ABC
transporters are also indicated. Comparison of MOAT-E
with ara indicates that the ara predicted protein is not
only a fused sequence, but also that it represents only
446 (~30%) of the 1503 MOAT-E residues.

Comparison of MOAT-E with the other members of the
MRP/cMOAT subfamily, which include MRP, cMOAT, MOAT-B,
MOAT-C and MOAT-D, is shown in Table IV. MOAT-E is
highly related to MOAT-D, MRP and cMOAT, with which it
shares 39-45% identity. This high degree of identity is
also indicated by the high percent identities of the
nucleotide binding folds, which range from 55-61%. In
contrast, MOAT-E is less related to MOAT-B and MOAT-C,
with which it shares ~31% and 34% identity, respectively.

Table IV. Amino acid identity among MRP/cMOAT sub-family
members. The bold type indicates the percent identity of
the overall proteins, and the parentheses indicate the
percent identity of the nucleotide binding
folds.

a Overall amino acid identities are indicated in bold-face, and identities of
nucleotide binding folds 1 and 2 are indicated in parentheses (NBF1/NBF2).
b Percent identity was obtained using the GAP command in the GCG package.

Comparison of the hydropathy profile of MOAT-E with
other members of the MRP/cMOAT subfamily is shown in
Figure 10. The data reveal that MOAT-E has a hydrophobic
N-terminal segment that is present in its closest
relatives, MOAT-D, MRP and cMOAT. This structural
feature is present in all of the currently known organic
anion transporters, and suggests that MOAT-E may share
substrate specificity with MRP and cMOAT. MOAT-E may
also share the drug resistance activity of the latter two
proteins. In contrast, MOAT-B and MOAT-C do not have
this hydrophobic N-terminal extension.

Expression Pattern of MOAT-E in Human Tissues. 

In a Northern blot of RNA isolated from various
tissues, MOAT-E expression is restricted to liver and
kidney, suggesting that MOAT-E may participate in the
excretion of substances into the urine and bile. See
Figure 11. This figure also shows that MOAT-E is
expressed as an ~6 kb transcript. This is in contrast to
the ~2.3 kb transcript that was reported for ara, clearly
indicating that the fused ara transcript is unique to the
cell line from which it was isolated, and is not a
physiological transcript. Together, the isolation of
MOAT-E and analysis of its sequence and expression
pattern suggest that it may be involved in cellular
resistance to drugs and/or the excretion of drugs into
the urine and bile.

DISCUSSION 

The present invention discloses additional
MRP/cMOAT-related transporters which were identified by
using a degenerate PCR cloning approach in which the
conserved amino terminal ATP-binding domain of known
eukaryotic transporters was targeted. Using this
approach the complete coding sequences of MOAT-B, MOAT-C,
MOAT-D and MOAT-E were obtained. MOAT-B is a protein
whose predicted structure indicates that it is a member
of the ABC transporter family. Comparison of the MOAT-B
predicted protein with other transporters reveals that it
is most closely related to MRP, cMOAT and yeast YCF1, and
thus extends the number of known full length MRP-related
transporters. The similarity of MOAT-B to these
transporters suggests that it shares a similar substrate
specificity. Transport assays using membrane vesicle
preparations indicate that MRP is capable of transporting
diverse organic anions, including glutathione
S-conjugates such as LTC4, oxidized glutathione, and
glucuronidated and sulfated conjugates of steroid
hormones and bile salts (7). Although membrane vesicle
transport assays of substrate specificity using
cMOAT-transfected cells have not yet been reported,
genetic and biochemical studies using TR- and EHBR rat
strains, which are defective in the hepatobiliary
excretion of glutathione and glucuronate conjugates,
indicate that it is also an ATP-dependent transporter of
organic anions. cMOAT, which is primarily expressed in
the canalicular membrane of hepatocytes, has been
reported to be absent in these rat strains, and
hepatocyte canalicular membranes prepared from the mutant
rats are deficient in the ATP-dependent transport of
glutathione and glucuronate conjugates (23, 24). In
addition, cMOAT protein has also been reported to be
absent in the hepatocytes of patients with Dubin-Johnson
syndrome (25), a disorder manifested by chronic
conjugated hyperbilirubinemia. YCF1, a yeast
transporter, has also been demonstrated to transport
glutathione complexes (26). Thus, based upon the
similarity of MOAT-B to these three transporters, it is
possible that it also functions to transport organic
anions, an activity critical to the cellular
detoxification of a wide range of xenobiotics.

MOAT-C, MOAT-D and MOAT-E are three other 
MRP/cMOAT-related transporters. The isolation of these 
three transporters extends the number of known full length 
members of this subfamily to six. Based upon the degree 
of amino acid similarity and overall topology these six 
proteins fall into two groups. The first group is 
composed of MOAT-D, MOAT-E, MRP and cMOAT. These four 
transporters are highly related, sharing ~39-45% amino 
acid identity. MOAT-D is more closely related to MRP 
(57% identity) than is cMOAT (48% identity), and is 
therefore the closest known relative of MRP. In addition 
to a high degree of amino acid identity, the similarity 
between MOAT-D, MRP and cMOAT also extends to overall 
topology. Like MRP and cMOAT, MOAT-D and MOAT-E have 
three membrane spanning domains, including an N-terminal 
hydrophobic extension that is predicted to harbor ~5 
transmembrane helices, and which is absent in 
transporters such as CFTR and MDR1. This N-terminal 
extension is also present in YCF1, a related yeast 
transporter that transports glutathione S-conjugates, and 
SUR, a more distantly related transporter involved in the 
regulation of potassium channels. The second group of 
MRP/cMOAT-related transporters is composed of MOAT-B and 
MOAT-C. These two transporters are distinguished from the 
first group by their lower level of amino acid similarity 
and distinct topology. Like MOAT-D and MOAT-E, MOAT-B 
and MOAT-C are more closely related to MRP (39% and 36%, 
respectively) and cMOAT (37% and 36%, respectively) than 
to other eukaryotic transporters. However, they share 
considerably less similarity with MRP, cMOAT, MOAT-D and 
MOAT-E than the latter four transporters share with each 
other (~39-45% identity). In addition, in contrast to 
MRP, cMOAT, MOAT-D and MOAT-E, MOAT-B and MOAT-C do not 
have an N-terminal membrane spanning domain, and their 
topology is therefore more similar to many other 
eukaryotic ABC transporters that also have only two 
membrane spanning domains. 
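
The percent identity figures used above to group the transporters can be illustrated with a minimal sketch. It assumes two protein sequences that have already been aligned (gaps written as "-") and counts identical residues over the columns in which neither sequence has a gap. The function name, the choice of denominator and the two short peptides are hypothetical and serve only as an illustration; the alignment program and identity convention actually used to obtain the values reported here are not specified in this passage.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: the inputs are two rows of a pairwise alignment of equal
    length, with '-' marking gaps. Other conventions (for example dividing
    by the full alignment length) give different numbers.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy peptides, not taken from the sequences of the invention.
    print(round(percent_identity("GSGKSSLLSA", "GSGKSTLL-A"), 1))
```

With a convention of this kind, pairs of transporters can be compared and sorted into the two groups described above according to their pairwise identity values.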

Defining the contributions of MOAT-B, MOAT-C, MOAT-D 
and MOAT-E to cytotoxic drug resistance will facilitate 
the design of novel chemotherapeutic agents. The 
multidrug resistance activity of MRP is well described. 
While the drug sensitivity pattern of cMOAT-transfected 
cells has not yet been reported, the possibility that it 
may also confer resistance to cytotoxic drugs is 
suggested by a recent report in which transfection of a 
cMOAT antisense vector was found to enhance the 
sensitivity of a human liver cancer cell line to both 
natural product drugs and cisplatin. Since MOAT-D and 
MOAT-E are more closely related to MRP than is cMOAT, the 
possibility that they will also confer resistance is 
particularly intriguing. The availability of the MOAT-B, 
MOAT-C, MOAT-D and MOAT-E cDNAs will facilitate the 
analysis of their possible contributions to cytotoxic 
resistance. 

While certain of the preferred embodiments of the 
present invention have been described and specifically 
exemplified above, it is not intended that the invention 
be limited to such embodiments. Various modifications 
may be made thereto without departing from the scope and 
spirit of the present invention, as set forth in the 
following claims. 

What is claimed is: 



1. An isolated nucleic acid molecule having the 
sequence of SEQ ID NO: 1, said nucleic acid molecule 
comprising a nucleotide sequence encoding a MOAT-B 
transporter protein about 1350 amino acids in length, 
said encoded transporter protein comprising a 
multi-domain structure including a tandem repeat of 
nucleotide binding folds appended C-terminal to a 
hydrophobic domain, said nucleotide binding folds having 
Walker A and B ATP binding sites, said C-terminal domain 
having a plurality of membrane spanning helices. 

2. The nucleic acid molecule of claim 1, which 
is DNA. 

3. The DNA molecule of claim 2, which is a 
cDNA comprising a sequence approximately 5.9 kilobase 
pairs in length that encodes said MOAT-B transporter 
protein. 

4. The DNA molecule of claim 2, which is a 
gene comprising introns and exons, the exons of said gene 
specifically hybridizing with the nucleic acid of SEQ ID 
NO 1, and said exons encoding said MOAT-B transporter 
protein. 

5. An isolated RNA molecule transcribed from 
the nucleic acid of claim 1. 

6. The nucleic acid molecule of claim 1, 
wherein said sequence encodes a MOAT-B transporter 
protein having an amino acid sequence selected from the 
group consisting of SEQ ID NO 2 and amino acid sequences 
encoded by natural allelic variants of said sequence. 

7. The nucleic acid molecule of claim 6, which 
comprises SEQ ID NO 1. 

8. An antibody immunologically specific for 
the protein encoded by the nucleic acid of claim 1. 

9. An antibody as claimed in claim 8, said 
antibody being monoclonal. 

10. An antibody as claimed in claim 8, said 
antibody being polyclonal. 

11. An isolated nucleic acid molecule having 
the sequence of SEQ ID NO: 3, said nucleic acid molecule 
comprising a sequence encoding a MOAT-C transporter 
protein about 1450 amino acids in length, said 
transporter protein having a multi-domain structure 
including a tandem repeat of nucleotide binding folds, 
said nucleotide binding folds having Walker A and B 
binding sites, and a C-terminal hydrophobic domain that 
contains several membrane spanning helices. 

12. The nucleic acid molecule of claim 11, which is 
DNA. 

13. The DNA molecule of claim 12, which is a cDNA 
comprising a sequence approximately 6.6 kilobase pairs in 
length that encodes said MOAT-C transporter protein. 

14. The DNA molecule of claim 12, which is a gene 
comprising introns and exons, the exons of said gene 
specifically hybridizing with the nucleic acid of SEQ ID 
NO 3, and said exons encoding said MOAT-C transporter 
protein. 

15. An isolated RNA molecule transcribed from the 
nucleic acid of claim 11. 



16. The nucleic acid molecule of claim 11, wherein 
said sequence encodes a MOAT-C transporter protein having 
an amino acid sequence selected from the group consisting 
of SEQ ID NO 4 and amino acid sequences encoded by 
natural allelic variants of said sequence. 

17. The nucleic acid molecule of claim 11, which 
comprises SEQ ID NO 3. 

18. An antibody immunologically specific for the 
protein encoded by the nucleic acid of claim 11. 

19. An antibody as claimed in claim 18, said 
antibody being monoclonal. 

20. An antibody as claimed in claim 18, said 
antibody being polyclonal. 

21. An oligonucleotide between about 10 and about 
200 nucleotides in length, which specifically hybridizes 
with a protein translation initiation site in a 
nucleotide sequence encoding amino acids of SEQ ID NO 4. 

22. An oligonucleotide between about 10 and about 
200 nucleotides in length, which specifically hybridizes 
with a protein translation initiation site in a 
nucleotide sequence encoding amino acids of SEQ ID NO 2. 

23. An isolated nucleic acid molecule having the 
sequence of SEQ ID NO: 5, said nucleic acid molecule 
comprising a sequence encoding a MOAT-D transporter 
protein about 1550 amino acids in length, said 
transporter protein having a multi-domain structure 
including a tandem repeat of nucleotide binding folds, 
said nucleotide binding folds having Walker A and B 
binding sites, and a C-terminal hydrophobic domain that 
contains several membrane spanning helices. 



24. The nucleic acid molecule of claim 23, which is 
DNA. 

25. The DNA molecule of claim 24, which is a cDNA 
comprising a sequence approximately 6 kilobase pairs in 
length that encodes said MOAT-D transporter protein. 

26. The DNA molecule of claim 24, which is a gene 
comprising introns and exons, the exons of said gene 
specifically hybridizing with the nucleic acid of SEQ ID 
NO 5, and said exons encoding said MOAT-D transporter 
protein. 

27. An isolated RNA molecule transcribed from the 
nucleic acid of claim 23. 

28. The nucleic acid molecule of claim 23, wherein 
said sequence encodes a MOAT-D transporter protein having 
an amino acid sequence selected from the group consisting 
of SEQ ID NO 6 and amino acid sequences encoded by 
natural allelic variants of said sequence. 

29. The nucleic acid molecule of claim 23, which 
comprises SEQ ID NO 5. 

30. An antibody immunologically specific for the 
protein encoded by the nucleic acid of claim 23. 

31. An antibody as claimed in claim 30, said 
antibody being monoclonal. 

32. An antibody as claimed in claim 30, said 
antibody being polyclonal. 

33. An oligonucleotide between about 10 and about 
200 nucleotides in length, which specifically hybridizes 
with a protein translation initiation site in a 
nucleotide sequence encoding amino acids of SEQ ID NO 6. 

34. An isolated nucleic acid molecule having the 
sequence of SEQ ID NO: 7, said nucleic acid molecule 
comprising a nucleotide sequence encoding a MOAT-E 
transporter protein about 1503 amino acids in length, 
said transporter protein having a multi-domain structure 
including a tandem repeat of nucleotide binding folds, 
said nucleotide binding folds having Walker A and B 
binding sites, and a C-terminal hydrophobic domain that 
contains several membrane spanning helices. 

35. The nucleic acid molecule of claim 34, 
which is DNA. 

36. The DNA molecule of claim 35, which is a 
cDNA comprising a sequence approximately 6 kilobase pairs 
in length that encodes said MOAT-E transporter protein. 

37. The DNA molecule of claim 35, which is a 
gene comprising introns and exons, the exons of said gene 
specifically hybridizing with the nucleic acid of SEQ ID 
NO 7, and said exons encoding said MOAT-E transporter 
protein. 

38. An isolated RNA molecule transcribed from 
the nucleic acid of claim 34. 

39. The nucleic acid molecule of claim 34, 
wherein said sequence encodes a MOAT-E transporter 
protein having an amino acid sequence selected from the 
group consisting of SEQ ID NO 8 and amino acid sequences 
encoded by natural allelic variants of said sequence. 

40. The nucleic acid molecule of claim 39, 
which comprises SEQ ID NO 7. 

41. An antibody immunologically specific for 
the protein encoded by the nucleic acid of claim 34. 

42. An antibody as claimed in claim 41, said 
antibody being monoclonal. 

43. An antibody as claimed in claim 41, said 
antibody being polyclonal. 

44. An oligonucleotide between about 10 and 
about 200 nucleotides in length, which specifically 
hybridizes with a protein translation initiation site in 
a nucleotide sequence encoding amino acids of SEQ ID NO 
7. 

45. A plasmid comprising a nucleotide sequence 
selected from the group consisting of SEQ ID NO: 1, SEQ 
ID NO: 3, SEQ ID NO: 5 and SEQ ID NO: 7. 

46. A vector comprising a nucleotide sequence 
selected from the group consisting of SEQ ID NO: 1, SEQ 
ID NO: 3, SEQ ID NO: 5 and SEQ ID NO: 7. 

47. A retroviral vector comprising a 
nucleotide sequence selected from the group consisting of 
SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5 and SEQ ID NO: 7. 

48. A host cell comprising at least one 
nucleic acid molecule having a sequence selected from the 
group consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID 
NO: 5 and SEQ ID NO: 7. 

49. A host cell as claimed in claim 48, 
wherein said host cell is selected from the group 
consisting of bacterial, fungal, mammalian, insect and 
plant cells. 

50. A host cell as claimed in claim 48, 
wherein said nucleic acid is provided in a plasmid and is 
operably linked to mammalian regulatory elements which 
confer high expression and stability of mRNA transcribed 
from said nucleic acid. 

51. A host cell as claimed in claim 48, 
wherein said nucleic acid is provided in a plasmid and is 
operably linked to mammalian regulatory control elements 
in reverse anti-sense orientation. 

52. A host animal comprising at least one 
nucleic acid molecule selected from the group consisting 
of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5 and SEQ ID 
NO: 7. 

53. A host animal as claimed in claim 52, 
wherein said animal harbors a homozygous null mutation in 
its endogenous MOAT gene wherein said mutation has been 
introduced into said mouse or an ancestor of said mouse 
via homologous recombination in embryonic stem cells, and 
further wherein said mouse does not express a functional 
mouse MOAT protein. 

54. The transgenic mouse of claim 53, wherein 
said mouse is fertile and transmits said null mutation to 
its offspring. 

55. The transgenic mouse of claim 53, wherein 
said null mutation has been introduced into an ancestor 
of said mouse at an embryonic stage following 
microinjection of embryonic stem cells into a mouse 
blastocyst. 

56. A method for screening a test compound for 
inhibition of MOAT mediated transport, comprising: 

a) providing a host cell expressing at least 
one MOAT-encoding nucleic acid having a sequence selected 
from the group consisting of SEQ ID NOS: 1, 3, 5, and 7; 

b) contacting said host cell with a compound 
suspected of inhibiting MOAT-mediated transporter 
activity; and 

c) assessing inhibition of transport mediated 
by said compound. 

57. A method as claimed in claim 56, wherein 
inhibition of MOAT mediated transport is indicated by 
restoration of anticancer drug sensitivity. 

58. A method as claimed in claim 57, wherein 
said inhibition of MOAT mediated transport is indicated 
by a reduction of transporter mediated cellular efflux of 
anticancer agents. 

59. A kit for detecting the presence of MOAT 
encoding nucleic acids in a sample, comprising: 

a) oligonucleotide primers specific for 
amplification of MOAT encoding nucleic acids; 

b) polymerase enzyme; 

c) amplification buffer; and 

d) MOAT specific DNA for use as a 
positive control. 

1 MKDIDIGKEY IIPSPGYRSV RERTSTSGTH RDREDSKFRR TRPLECQDAL ETAARAEGLS 
61 LDASKHSQLR ILDEEHPKGK YHHGLSALKP IRTTSKHQHP VDNAGLFSCM TFSWLSSLAR 



121 VAHKKGELSM EDVWSLSKHE SSDVNCRRLE RLWQEELNEV GPDAASLRRV VWIFCRTRLI 
TM1 TH2 

181 LSIVCLMITQ LAGFSGPAFM VKHLLEYTQA TESNLQYSLL LVLGLLLTEI VRSWSLAL.TVJ 
. TH3 

241 ALNYRTGVRL RGAILTMAFK KILKLKNIKE KSLGELINIC SNDGQRMFEA AAVGSLLAGG 
TH4 

301 PWA1LGKIY NVIII/3PTGF LGSAVFILFY PAKMFASRLT AYFRRKCVAA TDERVQKMNE 

TM5 

361 VLTYIKFIKM YAWVKAFSQS VQKIREEERR ILEKAGYFQG ITVGVAPIW VTASWTFSV 

— TH6 
421 HMTLGFDLTA AQAFTWTVF HSMTFAXKVT FFSVKSLSEA SVAVDRFKSL FLMEEVHMIK 

4 81 NKPASPHIKI EMKHATIAWD SSHSSIQNSP KLTPKMKKDK RASRGKKEKV RQLQRTEHQA 

r^MBFl 

541 VLAEQKGHLL LDSDERPSFE EEEGKHIHLG HLRLQRTLHS IDLEIQEGKL VGI CGSVGSG 

A 

601 KTSLISAILG QMTLLEGSIA ISGTFAYVAQ QAWILNATLR DNILFGKEYD EERYHSVLNS 

661 CCLRPDLAIL PSSDLTEIGE RGANLSGO^R QRISLAPALY SDRS IYILDD PLSAXDAHVG 

721 NHIFNSAIRK HUCSKTVLFV THQI/QYXVDC DEVIFMKEGC ITERGTHEEL MNLNGDYATI 

781 FHNIiI/LGET P PVEIKSKKET SGSQKKSODE GPKTGSVKKE KAVKPEEGQL VQLEEKGQGS 

TM7 # 
841 VPWSVYGVYI QAAGGPLAFL VXHALFMLNV GSTAFSTWWI* SYW1KQGSGN TTVTRGNETS 

TM8 

901 VSDSMKDNPH MQYYASIYAL SMAVMLILKA IRGWFVKCT LRASSKLHDE LPRRILRSPM 

TH9 

961 KFFDTTPTGR ILNRFSKDMD EVDVRLPFQA EKFIQKVTLV FFCVGMXAGV FFWFLVAVGP 
TM10 

1021 LVTLFSVLH1 VSKVLIREUC RLONXTQSPF LSHITSSIQG IATIHAYNKG OEFLERTQEL 

TH11 TK12 
1081 LDDNQAPFFL FTCAKRWIAV RLDL.IS1AXJ: TTTGLMTVLM HGQIPP^YAG IAISTAVQLT 



1141 glfqftvrIjA setearftsv ERINHYTKTL SLEAPAiUKN kapspdwpqe gevtfenaem 

j-^NBF2 

1201 RYREKLPLVL KKVSFTIKPK EKIGrV GRTG SGKSS LGMAL FRLVELSGGC IKIDGVRISD 

A 

1261 IGLADLRSKX SIIPQEPVUT SGTVRSKLDP FNQYTEDQIW DALERTHMKE CIAQLPLKLE 

NBF2^ 

1321 SEVMENGDNF SVGERQ LLCI ARALLRHC TI LILDEATAAM DTETDLLIQE TIREAFADCT 

C B 
1381 MLTIAHRLHT VIX3SDRIMVT. AQGQWEFDT PSVLLSNDSS RFYAMFAAAE HKVAVXG 

# TM1 

1 MGPMDALCGS GELGSKFWDS NLSVHTENPD LTPCFQNSLL AWVPCIYLWV ALPCYLLYLR 

< TM2 ________ 

61 HHCRGYIILS HLSKLRMVLG VLLWCVSWAD LFYSFHGLVH GRAPAPVFFV T P LWGVTML 

TH3 TM4 

121 LATLLIQYER LQGVQSSGVL IIFWFLCWC AIVPFRSKIL LAKAEGEISD PFRFTTFYIH 
TM5 

181 FALVLSALIL ACFREKPPFF SAKNVDPNPY PETSVGFLSR LFFWWFtKMA IYGYRHPLEE 

241 KDLWSLKEED RSQKWQQLL EAWRKQEKQT ARHKASAAPG KNASGEDEVL LGARPRPRKP 

TM6 _-_-__ 
301 SFLKALLATF GSSFL1SACF KLIQDLLSFI NPQLLSILIR FISNPMAPSW WGFLVAGLMF 
TM7 

361 LCSMMQSLIL QHYTHYIFVT GVKFRTGIMG VIYRKALVIT HSVKRASTVG EIVNLMSVDA 

TM8 TM9 
421 QRFMDI-APFL NLLWSAPI/QI ILAIYFLWQN LGPSVLAGVA FMVLLIPLNG AVAVKMRAFO 

481 VKQMKLKDSR IKLMSEILNG IKVTjKLYAWE PSFLKQVEGI RQGELQLLRT AAYLHTTTTF 

TH10 TM11 
541 TWMCSPFDVT LITLWVYVYV DPNNVXDAEK AFVSVSLFNI LRLPLNMLPQ LISNLTQASV 

r^HBFl 

601 SLKRIQQFLS QEELDPQSVE RKTISPGYAI TIHSGTFTWA QDLPPTLHSL DIQVPKGALV 

661 AW GPVGCGK S SLVSAIXGE KEKLEGKVHM KGSVAYVPQQ AWIQNCTLQE HVLFGKALNP 
A 

721 KRYOOTt-EAC ALLADLEMLP GGDQTEIGEK GINLSGGQRQ RVSLARAVYS PA DIFLLD DP 

NBFl^-j C B 

781 LSAVDSHVAK HIFDHYIGPE GVLAGKTRVL VTHGISFLPQ TDFIIVTADG OVSEMGPYPA 

841 LLQRHGS FAN FLCHYAPDED QGKLEDSWTA LEGAEDKEAX LIEDTLSNET DLTDNDPVTY 

901 WQKQFMRQL SALSSDGEGQ GRPVPRKBLG PSEKVQVTEA KADGALTQEE KAAIGTVELS 

™12 «« 

961 VFV7DYAKAVG LCTTLAICLL YVGOSAAAIG AKVWLSAWTN DAKADSRQNN TSLRLGVYAA 
TM13 

1021 IXSXXjQGFXjVM LAAHAMAAGG IQAARVXHQA LLHNKIRSPQ SFFDTTPSGR ILHCFSKDIY 

TM14 TM15 
X081 WDEVIAPVT LKLLNSFFHA ISTI/WIMAS TPLFTWILP LAVXYTLVQR FYAATSRQLK 

1141 RLESVSRSPI YSKFSETVTG ASVXRAYKRS RDFEIISDTK VDANQRSCYP YIISNRWLSI 

TK16 TM17 
1201 CVEFVGHCW LFAAXFAVTG RSSLNPGLVG LSVSYSLQVT FALNWHIRMM SOLESKIVAV 

f>"HBF2 

1261 ERVKEYSKTE TEAPWVYEGS RPPEGWPPRG EVEFRNYSVR TRPGLDLVXR DLSLHVHGGE 

1321 KVGI VGRTGA GKS SKTLCLF RJLEAAKGEI RIDGLHVADI GLHDLRSQLT IIPQDPILFS 
A 

1381 GTLRMHLDPF GSTSEEDIWW ALELSHLKTF VSSQPAGL£>F QCSEGGEH L5 VGQ RQLVCLA 

KBF2-4-J c 
1441 RALLRKSRIL VX*D EATAAID LETDNLIQAT IRTQFDTCTV LTIAHRLHTI MDYTKVLVLD 
B 

1501 KGWAEFDSP ARLIAARG1F YGMARDAGLA 

MOAT B cDNA AND AMINO ACID SEQUENCE ENCODED THEREBY 



ATGCTGCCCGTGTACCAGGAGGTG AAGCCCAACCCGCTGCAGGACGCG AACATCTGCTCA 
1 + + + + + + 50 

TACGACGGGCACATGGTCCTCCACTTCGGGTTGGGCGACGTCCTGCGCTTGTAGACGAGT 

a MLPVYGEVKPNPLGPANICS - 

CGCGTGTTCTTCTGGTGGCTCAATCCCTTGTTTAAAATTGGCCATAAACGGAGATTAGAG 

61 + + + + + 120 

G CG C AC AAG AAG ACCACCG AG TT AG G G A AC A A ATTTT A ACCG GT ATTTG CCTCT AATCTC 

a RVFFWWLNPLFKIGHKRRLE - 

GAAGATGATATGTATTCAGTGCTGCCAGAAGACCGCTCACAGCACCTTGGAGAGGAGTTG 

121 + + + + -t- + 180 

CTTCT ACTAT AC ATAAGTC ACG ACG GTCTTCTG G CG AG TGTCGTG G AACCTCTCCTC AAC 

a EDDMYSVLPEDRSGHLGEEL - 

C AAG GGTTCTG G G ATAAAG AAG TTTT A AG AG CTG AG AATG ACGC AC AG AAGCCTTCTTTA 

181 ■+- + + + + + 240 

GTTCCC AAG ACCCTATTTCTTC A A AATTCTCG ACTCTT ACTG CGTGTCTTCG G AAG AAAT 

a QGFWDKEVLRAENDAQKPSL - 

ACAAG AG C AATC ATAAAGTGTTACTG G AAATCTTATTTAGTTTTG G G A ATTTTTACGTTA 

241 4- + + + + -f 300 

TGTTCTCGTTAGTATTTCAC AATG ACCTTT AG AAT AAATC AAAACCCTT AAAAATG C AAT 

a TRAIIKCYWKSYLVLG1FTL - 

ATTG AGG AAAGTG CC AAAGT AATCC AG CCC AT ATTTTTG G G AAAAATTATTAATT ATTTT 

301 4- 4- + + — + + 360 

TAACTCCTTTCACGGTTTCATTAGGTCGGGTATAAAAACCCTTTTTAATAATTAATAAAA 

a IEESAKVIGPIFLGKIINYF - 

GAAAATTATG ATCCCATGG ATTCTGTGGCTTTG AACACAGCGTACGCCTATGCCACGGTG 
361 + + + + + + 420 

CTTTTAATACTAGGGTACCTAAG ACACCG AAACTTGTGTCGCATGCGG ATACGGTGCCAC 

ENYDPMDSVALNTAYAYATV - 

CTGACTTTTTGCACGCTCATTTTGGCTATACTGCATCACTTATATTTTTATCACGTTCAG 

421 + + + + + + 480 

GACTGAAAAACGTGCGAGTAAAACCGATATGACGTAGTGAATATAAAAATAGTGCAAGTC 

LTFCTLILAILHHLYFYHVQ - 

TGTGCTGGGATGAGGTTACGAGTAGCCATGTGCCATATGATTTATCGGAAGGCACTTCGT 

481 + + -f- + — + + 540 

ACACGACCCTACTCCAATGCTCATCGGTACACGGTATACTAAATAGCCTTCCGTGAAGCA 

CAGMRLRVAMCHMIYRKALR - 

CTTAGTAACATG G CCATG G G G A AG AC AACC ACAG GCCAG ATAG TCAATCTG CTGTCC AAT 
541 + + + + + + 6 qo 

GAATCATTGTACCGGTACCCCTTCTGTTGGTGTCCGGTCTATCAGTTAGACGACAGGTTA 

a LSNMAMGKTTTGQIVNLLSN - 

GATGTGAACAAGTTTGATCAGGTGACAGTGTTCTTACACTTCCTGTGGGCAGGACCACTG 

601 + + 4- + + + 660 

CTACACTTGTTCAAACTAGTCCACTGTCACAAGAATGTGAAGGACACCCGTCCTGGTGAC 

a DVNKFDGVTVFLHFLWAGPL - 

CAGGCGATCGCAGTGACTGCCCTACTCTGGATGGAGATAGGAATATCGTGCCTTGCTGGG 

661 : + + + + + + 720 

GTCCGCTAGCGTCACTGACGGGATGAGACCTACCTCTATCCTTATAGCACGGAACGACCC 

a QAIAVTALLWME1G1SCLAG - 

ATG G C A GTTCT AATC ATTCTCCTG C CCTTG C AAAG CTG I 1 ! I GGGAAGTTGTTCTCATCA 

721 4 + + + + + 780 

TACCGTCAAGATTAGTAAGAGGACGGGAACGTTTCGACAAAACCCTTCAACAAGAGTAGT 

a MAVLIILLPLQSCFGKLFSS - 

CTGAGGAGTAAAACTGCAACTTTCACGGATGCCAGGATCAGGACCATG AATG AAGTTATA 
781 — '■ + + + -4- + + 840 
G ACTCCTCATTTTG ACGTTG AAAGTGCCTACGGTCCTAGTCCTGGTACTTACTTCAATAT 
a LRSKTATFTDARIRTMNEVI - 

ACTGGTATAAGG ATAATAAAAATGTACGCCTGGGAAAAGTCATTTTCAAATCTTATTACC 

841 — + + -i + + — + 900 

TGACCATATTCCTATTATTTTTACATGCGGACCCTTTTCAGTAAAAGTTTAGAATAATGG 

a TGlRtlKMYAWEKSFSNLIT 

AATTTG AG A A AG A AG G AG ATTTCC AAG ATTCTG AG AAGTTCCTG CCTC AG G G GG ATG AAT 

gOl 4- + + + + + 960 

TTAAACTCTTTC7TCCTCTAAAGGTTCTAAGACTCTTCAAGGACGGAGTCCCCCTACTTA 

a NLRKKEISKILRSSCLRGMN - 

TTGGCTTCGTTTTTCAGTGCAAGCAAAATCATCGTGTTTGTGACCTTCACCACCTACGTG 
961 + + + + + -4 1020 

AACCGAAGCAAAAAGTCACGTTCGTTTTAGTAGCACAAACACTGGAAGTGGTGGATGCAC 

a LASFFSASKIIVFVTFTTYV - 

CTCCTCGGCAGTGTGATCACAGCCAGCCGCGTGTTCGTGGCAGTGACGCTGTATGGGGCT 

1021 + — + + + + 4 1080 

GAGGAGCCGTCACACTAGTGTCGGTCGGCGCACAAGCACCGTCACTGCGACATACCCCGA 

a LLGSVITASRVFVAVTLYGA - 

GTGCG G CTG ACG GTT ACCCTCTTCTTCCCCTC AG CCATTG AG AG GGTGTC AG AG G C AATC 

1081 + 4- + + 4- 4 1140 

CACGCCGACTGCCAATGGGAGAAGAAGGGGAGTCGGTAACTCTCCCACAGTCTCCGTTAG 

a VRLTVTLFFPSAIERVSEAI- 

GTC AG CATCCG AAG AATCC AG ACCTTTTTG CTACTTG ATG AG AT ATC AC AG CG C AACC G T 

1141 + 4- 4- 4- 4- + 12O0 

CAGTCGTAGGC \ I CI I AGGTCTGGAAAAACGATGAACTACTCTATAGTGTCGCGTTGGCA 

a VSIRRIGTFLLLDEISGRNR- 

CAGCTGCCGTCAG ATGGTAAAAAGATGGTGCATGTGCAGG ATTTTACTGCT7TTTGGGAT 

1201 4- 4- -4 + + + 1260 

GTCGACGGCAGTCTACCATTTTTCTACCACGTACACGTCCTAAAATGACGAAAAACCCTA 
a QLPSDGKKMVHVQDFTAFWD 

AAGGCATCAGAGACCCCAACTCTACAAGGCCTTTCCTTTACTGTCAG ACCTGGCG AATTG 

1261 — + — + + ■+ + — 4 1320 

TTCCGTAGTCTCTGGGGTTGAGATGTTCCGGAAAGGAAATG ACAGTCTGGACCGCTTAAC 

a KASETPTLQGLSFTVRPG EL 

TTAGCTGTGGTCGGCCCCGTGG6AGCAGGGAAGTCATCACTGTTAAGTGCCGTGCTCGGG 

1321 + + + + + ~+ 1380 

AATCGACACCAGCCGGGGCACCCTCGTCCCTTCAGTAGTGACAATTCACGGCACGAGCCC 

a LAVVGPVGAGKSSLLSAVLG - 

GAATTGGCCCCAAGTCACGGGCTGGTCAGCGTGCATGGAAGAATTGCCTATGTGTCTCAG 
1331 + + + + . — + + 1440 

CTTAACCGGGGTTC AGTG CCCG ACC AGTCG C ACGT ACCTTCTT A ACG G AT AC AC AG AGTC 

a ELAPSHGLVSVHGR1AYVSQ - 

CAGCCCTGGGTGTTCTCGGGAACTCTGAGGAGTAATATTTTATTTGGGAAGAAATATGAA 

1441 + + + 4- + -+ 1500 

GTCGGGACCCACAAGAGCCCTTGAGACTCCTCATTATAAAATAAuACCCTTCTTl'ATACTT 

a QPWVFSGTLRSNILFGKKYE - 

AAGG AACG ATATG AAAAAGTC ATAAAG GCTTGTGCTCTG AAAAAG G ATTTACAG CTGTTG 

1501 + + + + + + 1560 

TTCCTT G CTATA C 1 \ \ 1 I CAGTATTTCCGAACACGAG AC I I i I I CCTAAATGTCG ACAAC 

a KERYEKVIKACALKKDLQLL- 

GAGGATGGTGATCTGACTGTGATAGGAGATCGGGGAACCACGCTGAGTGGAGGGCAGAAA 

1561 + 4- + 4 + + 1620 

CTCCTACCACTAGACTGACACTATCCTCTAGCCCCTTGGTGCG ACTCACCTCCCGTCTTT 

a EOGDLTVIGDRGTTLSGGQK - 

GCACGGGTAAACCTTGCAAGAGCAGTGTATCAAGATGCTGACATCTATCTCCTGGACGAT 

1621 + + + + + + 1680 

CGTGCCCATTTGG AACGTTCTCGTCACATAG7TCTACGACTGTAG ATAGAGGACCTGCTA 
a ARVNLARAVYQDADIYLLDD - 

CCTCTCAGTGCAGTAG ATGCGG AAGTTAGCAG ACACTTGTTCG AACTGTGTATTTGTCAA 

1681 + -- + + -- - + — + + 1740 

GGAGAGTCACGTCATCTACGCCTTCAATCGTCTGTGAACAAGCTT6ACACATAAACAGTT 

a PLSAVDAEVSRHLFELCICG - 

ATTTTGCATGAGAAGATCACAATTTTAGTGACTCATCAG rTGCAGTACCTCAAAGCTGCA 
1741 + + + + + + 1800 

T A AA ACGT ACTCTTCT AGTGTT A A AATC ACTG AGTAGTC AACGTC ATGG AGTTTCG ACGT 

a ILHEKITILVTHGLGYLKAA - 

AGTC AG ATTCTG AT ATTG AA AG ATG GT AA A ATGGTG C AG A AG G G G ACTT AC ACTG AG TTC 

1801 + + + + + + 1860 

TCAGTCTAAGACTATAACTTTCTACCATTTTACCACGTCTTCCCCTGAATGTGACTCAAG 

a SQlLILKDGKMVGKGTYTEF 

CTAAAATCTG GT AT AG ATTTTG G CTCCCTTTT A AAG AAGG AT A ATG AGG A AAGTG A AC A A 

1861 + + + + + + 1920 

G A I ll I AG ACC AT ATCT AAAACCG AG G G AA A ATTTCTTCCT ATT ACTCCTTTC ACTTG TT 

a LKSGIDFGSLLKKDNEESEQ - 

CCTCC AGTTCCAGG AACTCCC AC ACT AAG G AATCGT ACCTTCTC AG AG TCTTCG G TTTG G 

1921 + + + + + + 1980 

G G AG GTC AAG GTCCTTG AG G G TG TG ATTCCTTAGC ATG G AAG AG TCTCAG AAG CC AAACC 

a PPVPGTPTLRNRTFSESSVW - 

TCTCAACAATCTTCTAGACCCTCCTTGAAAGATGGTGCTCTGGAGAGCCAAGATACAGAG 

1981 + + + + + + 2040 

AG AGTTGTT AG AAG ATCTG G G AG G AACTTTCT ACC ACG AG ACCTCTCG GTTCTATGTCTC 

a SOQSSRPSLKDGALESQDTE 

AATGTCCCAGTTACACTATCAGAGGAGAACCGTTCTG AAGGAAAAGTTGGTTTTCAGGCC 

204 1 + + + + + + 2 1 00 

TTACAGGGTCAATGTG ATAGTCTCCTCTTGGCAAGACTTCC I I I I CAACCAAAAGTCCGG 

a NVPVTLSEENRSEGKVGFQA 
TATAAG A ATT ACTTC AG AG CTG G TG CTC ACTGG ATTG TCTTC ATTTTCCTT ATTCTCC T A 

2101 + — + + + 4- + 2160 

ATATTCTTAATG AAGTCTCGACCACGAGTGACCTAACAG AAGTAAAAGG AA7 AAG AGG AT 

a YKNYFRAGAHWIVFIFLILL 

AACACTGCAGCTCAGGTTGCCTATGTGCTTCAAGATTGGTGGCTTTCATACTGGGCAAAC 

2161 4--— 4- + + + + 2220 

TTGTGACGTCGAGTCCAACGGATACACGAAGTTCTAACCACCGAAAGTATGACCCGTTTG 

a NTAAQVAYVLQOWWLSYWAN - 

A AAC AAAG T ATG CTA AATG TC ACTG T A A ATG G AGG AG G A A ATG T A ACCG AG A AG CT AG AT 

222 1 — 4- + + 4- + + 2280 

TTTGTTTC AT ACG ATTT AC AGTG AC ATTT ACCTCCTCCTTT AC ATTGG CTCTTCG ATCT A 

a KOSMLNVTVNGGGNVTEKLD - 

C i^AACTGGTACTTAGGAATTTATTCAGGTTTAACTGTAGCTACCGTTC IT I T I GGCATA 

2281 4- 4- 4- 4- 4- 4- 2340 

GAATTGACCATGAATCCTTAAATAAGTCCAAATTGACATCGATGGCAAGAAAAACCGTAT 

a LNWYLGIYSGLTVATVLFGI - 

GCAAGATCTCTATTGGTATTCTACGTCCTTGTTAACTCTTCACAAACTTTGCACAACAAA 

2341 + 4- + 4- 4- 4- 2400 

CGTTCTAG AG AT AACCAT AAG ATG C AG G AAC AATTG AG AAGTG TTTG AAACG TG TTG TTT 

a ARSLLVFYVtVNSSOTLHNK - 

ATG TTTG AGTC AATTCTG AAAG CTCCG GT ATTATTCTTTG AT AG A A ATCC AAT AG G AAG A 

2401 + + 4- + + 4 2460 

TACAAACTCAGTTAAGACTTTCGAGGCCATAATAAGAAACTATCTTTAGGTTATCCTTCT 

a MFESILKAPVLFFDRNPIGR - 

AMI lAAATCGTTTCTCCAAAGACAin-GGACACiT"GGATGATTTGCTGCCGCTGACGTTT 

2461 4- + 4- 4- + 4 2520 

TAAAATTTAGCAAAGAGGTTTCTGTAACCTGTGAACCTACTAAACGACGGCGACTGCAAA 

a ILNRFSKDIGHLODLLPLTF 
TTAG ATTTCATCCAG ACATTGCTACAAGTGGTTGGTGTGGTCTCTGTGGCTGTGGCCGTG 

252 1 + + + + - - + - + 2580 

AATCTAAAGTAGGTCTGTAACG ATGTTCACCAACCACACCAG AGACACCG ACACCGGCAC 

a LDFIQTLLOVVGVVSVAVAV - 

ATTCCTTGG ATCGCAATACCCTTGGTTCCCCTTGGAATCATTTTCATTTTTCTTCGGCG A 

258 1 + — + + — + + w 2640 

TAAGGAACCTAGCGTTATGGGAACCAAGGGGAACCTTAGTAAAAGTAAAAAGAAGCCGCT 

a IPWIAIPlVPLGllFfFLRR- 

TATTTTTTGGAAACGTCAAGAGATGTGAAGCGCCTGGAATCTACAACTCGGAGTCCAGTG 

2641 + + 4- + + 4- 2700 

ATAAAAAACCTTTGCAGTTCTCTACACTTCGCGGACCTTAGATGTTGAGCCTCAGGTCAC 

a YFLETSRDVKRLESTTRSPV - 

TTTTCCCACTTGTC ATCTTCTCTCC AG GGG CTCTG G ACC ATCCG G G C AT AC A AAG C AG AA 

2701 + 4 4- + + + 2760 

AAAAGGGTGAACAGTAGAAGAGAGGTCCCCGAGACCTGGTAGGCCCGTATGTTTCGTCTT 

a FSHLSSSLGGLWT1RAYKAE- 

GAGAGGTGTCAGGAACTGTTTG ATGCACACCAGGATTTACATTCAGAGGCTTGGTTCTTG 

2761 + + 4- 4- + + 2820 

CTCTCCACAGTCCTTGACAAACTACGTGTGGTCCTAAATGTAAGTCTCCGAACCAAGAAC 

a ERCGELFDAHGDLHSEAWFL - 

TTTTTGACAACGTCCCGCTGGTTCGCCGTCCGTCTGGATGCCATCTGTGCCATGTTTGTC 

2821 + + + 4- + 4- 2880 

AAAAACTGTTGCAGGGCGACCAAGCGGCAGGCAGACCTACGGTAGACACGGTACAAACAG 

a FLTTSRWFAVRLDAICAMFV - 

ATC ATCGTTGCCTTTG G GTCCCTG ATTCTG G C AAA AACTCTG G ATG CCGGGC AG GTTG GT 

2881 4- + 4- 4- 4 + 2940 

TAGTAGCAACGGAAACCCAGGGACTAAGACCGTTTTTGAGACCTACGGCCCGTCCAACCA 

a MVAFGSLILAKTLDAGQVG - 

TTGGCACTGTCCTATGCCCTCACGCTCATGGGGATGTTTCAGTGGTGTGTTCGACAAAGT 
2941 + + + + + + 3000 

AACCGTG ACAGG ATACGGG AGTGCG AGTACCCCTACAAAGTC ACCACACAAGCTGTTTCA 

a LALSYALTLMGMFOWCVRGS - 

GCTG AAGTTG AG AATATG ATG ATCTC AGTAG AAAGGGTC ATTG AATAC ACAGACCTTG AA 

3001 + — - + — + -- + + - -- + 3060 

CGACTTCAACTCTTATACTACTAG AGTCATCTTTCCCAGTAACTTATGTGTCTGG AACTT 

a AEVENMMtSVERVIEYTDLE - 

AAAGAAGCACCTTGGGAATATCAGAAACGCCCACCACCAGCCTGGCCCCATGAAGGAGTG 
3061 + + + + + + 3120 

TTTCTTCGTGGAACCCTTATAGTCTTTGCGGGTGGTGGTCGGACCGGGGTACTTCCTCAC 
a KEAPWEYQKRPPPAWPHEGV- 

ATAATCTTTGACAATGTGAACTTCATGTACAGTCCAGGTGGGCCTCTGGTACTGAAGCAT 

3121 + + + + + + 3180 

T ATT AG A AACTGTTAC ACTTG A AG T AC ATGTC AG GTCC ACCCGG AG ACC ATG ACTTCGT A 

a ItFDNVNFMYSPGGPLVLKH 

CTG ACAG C ACTC ATTAAATC AC AAG AAAAG GTTG G C ATTG TG G G A AG AACCG GAGCTGGA 

3181 4- + 4 + + + 3240 

GACTGTCGTGAGTAATTTAGTGTTCTTTTCCAACCGTAACACCCTTCTTGGCCTCGACCT 

a LTAL1KSQEKVG1VGRTGAG 

AAAAGTTCCCTC ATCTCAG CCCTTTTTAG ATTG TC AG AACCCG AAG G TAAAATTTG G ATT 

3241 + 4- + + + + 3300 

TTTTCAAG G G AG TAG AGTCG G G AAAAATCTAAC AGTCTTG G G CTTCC ATTTTA AACCT A A 

a KSSLISALFRLSEPEGKIWI - 

GATAAGATCTTG ACAACTG AAATTGGACTTCACG ATTTAAGG AAGAAAATGTCAATCATA 

3301 + + + + + 4 3360 

CT ATTCT AG AACTG TTG ACTTT AACCTG AAGTG CT A A ATTCCTTCTTTT AC A G TT AG TAT 

a DKILTTEIGLHDLRKKMSII 

CCTCAGGAACCTG I I I I GTTCACTGGAACAATGAGG AAAAACCTGGATCCCTTTAAGG AG 
3361 + 4 4 4 + 4 3420 
GG AGTCCTTGGACAAAACAAGTG ACCTTGTTACTCCTTTTTGG ACCTAGGGAAATTCCTC 
PQEPVLFTGTMRKNLDPFKE 

CACACGG ATGAGGAACTGTGG AATGCCTTACAAGAGGTACAACTTAAAG AAACCATTG AA 

3421 -- + -i + + + + 3480 

GTGTGCCTACTCCTTGACACCTTACGG AATGTTCTCCATGTTG AATTTCTTTGGTAACTT 

HTDEELWNALGEVGLKETtE - 

GATCTTCCTGGTAAAATGGATACTGAATTAGCAGAATCAGGATCCAATTTTAGTGTTGGA 
3481 + + + ___ + + + 3540 

CTAGAAGGACCATTTTACCTATGACTTAATCGTCTTAGTCCTAGGTTAAAATCACAACCT 
DLPGKMDTELAESGSNFSVG - 

C AAAG AC AACTG GTGTG CCTTG CC AG G G C AATTCTC AGG A A A A ATC AG AT ATTG ATT ATT 

3541 + + + + + + 3600 

GTTTCTGTTGACCACACGGAACGGTCCCGTTAAGAGTCCI I I I I AGTCTATAACTAATAA 

GRGLVCLARAILRKNGILM - 

G ATG AAG CG ACG G C AAATGTG G ATCCAAG AACTG ATG AG TT A AT AC A A AAAAAAATCCG G 

3601 + + + + + + 3660 

CTACTTCGCTGCCGTTTACACCTAGGTTCTTGACTACTCAATTATGTTTTTTTTTAGGCC 

DEATANVDPRTDELIGKKIR - 

G AG AAATTTG CCC ACTG C ACCG TG CT AACC ATTG C AC AC AG ATTG AACACC ATT ATTG AC 

3661 + + + + + 4 3720 

CTCTTTAAACG G GTG ACGTG G C ACG ATTG GTAACGTGTGTCTAACTTGTGGTAAT AACTG 

a EKFAHCTVLTIAHRLNTIID - 

AG CG AC AAG AT A ATGGTTTT AG ATT C AG G AAG ACTG AAAG A AT ATG ATG AG CCG TATGTT 

3721 + + 4- 4- ____„. 4 . 4 3780 

TCGCTGTTCTATTACCAAAATCTAAGTCCTTCTGACTTTCTTATACTACTCGGCATACAA 

a SDKIMVLOSGRLKEYOEPYV - 

TTGCTGCAAAATAAAGAGAGCCTATTTTACAAGATGGTGCAACAACTGGGCAAGGCAGAA

3781 + + + + + + 3840

AACGACGTTTTATTTCTCTCGGATAAAATGTTCTACCACGTTGTTGACCCGTTCCGTCTT

a LLQNKESLFYKMVQQLGKAE

GCCGCTGCCCTCACTGAAACAGCAAAACAGGTATACTTCAAAAGAAATTATCCACATATT 

3841 + - + + + + + 3900 

CGGCG ACGGG AGTGACTTTGTCG I I I ! GTCCATATG AAGTTTTCTTT AATAGGTGTATAA 

a AAALTETAKQVYFKRNYPH1 - 

GGTCACACTGACCACATGGTTACAAACACTTCCAATGGACAGCCCTCG ACCTTAACTATT 

3901 + + + + + + 3960 

CCAGTGTGACTGGTGTACCAATGTTTGTGAAGGTTACCTGTCGGGAGCTGGAATTGATAA 

a GHTDHMVTNTSNGQPSTLTI 

TTCGAGACAGCACTG

3961 + 3975

AAGCTCTGTCGTGAC

a FETAL -

MOAT C cDNA AND AMINO ACID SEQUENCE ENCODED THEREBY

ATGAAGG ATATCGACATAGGAAAAG AGTATATCATCCCCAGTCCTGGGTATAGAAGTGTG 

1 — - + 4- + + -— + + 60 

TACTTCCTATAGCTGTATCCTTTTCTCATATAGTAGGGGTCAGG ACCCATATCTTCACAC 

a MKDIDIGKEYIIPSPGYRSV 

AGGGAGAGAACCAGCACTTCTGGGACGCACAGAGACCGTGAAGATTCCAAGTTCAGGAGA

61 + + + + + + 120

TCCCTCTCTTGGTCGTGAAGACCCTGCGTGTCTCTGGCACTTCTAAGGTTCAAGTCCTCT 

a RERTSTSGTHRDREDSKFRR - 

ACTCGACCGTTGGAATGCCAAGATGCCTTGGAAACAGCAGCCCGAGCCGAGGGCCTCTCT 

121 — 4- 4- + + + + 180 

TGAGCTGGCAACCTTACGGTTCTACGGAACCTTTGTCGTCGGGCTCGGCTCCCGGAGAGA 

a TRPLECODALETAARAEGLS - 

CTTGATGCCTCCATGCATTCTCAGCTCAGAATCCTGGATGAGGAGCATCCCAAGGGAAAG 

181 + + 4- + + + 240 

G AACTACG G AGGT ACGT AAG AGTCG AGTCTT AG G ACCTACTCCTCGTAG G GTTCCCTTTC 

a LDASMHSQLRILOEEHPKGK - 

TACCATCATGGCTTGAGTGCTCTGAAGCCCATCCGGACTACTTCCAAACACCAGCACCCA 

241 4- 4- 4- + 4- 4- 300 

ATGGTAGTACCGAACTCACGAGACTTCGGGTAGGCCTGATGAAGGTTTGTGGTCGTGGGT 
a YHHGLSALKPIRTTSKHQHP * 

GTGG AC AATG CTG G G CTTTTTTCCTG T ATG ACTTTTTCG T G G CTTTCTTCTCTG G CCCG T 

301 4- + 4- 4- 4- 4 360 

CACCTGTTACGACCCGAAAAAAGGACATACTGAAAAAGCACCGAAAGAAGAGACCGGGCA 
a VDNAGLFSCMTPSWLSSLAR - 

GTGGCCCACAAGAAGGGGGAGCTCTCAATGGAAGACGTGTGGTCTCTGTCCAAGCACGAG

361 + + + + + + 420

CACCGGGTGTTCTTCCCCCTCGAGAGTTACCTTCTGCACACCAGAGACAGGTTCGTGCTC

a VAHKKGELSMEDVWSLSKHE

TC7TCTGACGTG AACTGCAGAAG ACTAG AG AGACTGTGGCAAGAAG AGCTGAATGAAGTT 

421 - + + + + + + 480 

AGAAG ACTGCACTTG ACGTCTTCTG ATCTCTCTG AC ACCGTTCTTCTCG ACTT ACTTCAA 

a SSDVNCRRLERLWGEELNEV 

GGGCCAGACGCTGCTTCCCTGCGAAGGGTTGTGTGGATCTTCTGCCGCACCAGGCTCATC 

481 + + + + + + 540 

CCCGGTCTGCGACGAAGGGACGCTTCCCAACACACCTAGAAGACGGCGTGGTCCGAGTAG 

a GPDAASLRRVVWIFCRTRLI - 

CTGTCC ATCG TGTG CCTG ATG ATC ACG C AG CTG G CTG G CTTC AG TGG ACC AG CCTTC ATG 

541 + h + + + + 600 

G AC AGG T AGC AC ACG G ACTACTAGTGCGTCG ACCG ACCG A AGTC ACCTG GTCG G AAGT AC 

a LSiVCLMITQLAGFSGPAFM - 

GTGAAACACCTCTTGGAGTATACCCAGGCAACAGAGTCTAACCTGCAGTACAGCTTGTTG 

601 + + + + 4- + 660 

CACTTTGTG G AG AACCTC AT ATG GGTCCGTTGTCTC AG ATTG G ACG TC ATGTCG AAC AAC 

a VKHLLEYTGATESNLQYSLL - 

TTAGTGCTGGGCCTCCrrCCTGACGGAAATCGTGCGGTCTTGGTCGCrrGCACTGACTTGG 

661 + + + + + + 720 

AATCACGACCCGGAGGAGGACTGCCTTTAGCACGCCAGAACCAGCGAACGTGACTGAACC 

a LVLGLLLTEIVRSWSLALTW - 

GCATTGAATTACCGAACCGGTGTCCGCTTGCGGGGGGCCATCCTAACCATGGCATTTAAG 

721 + + + + + + 780 

CGTAACTT AATG G CTTG GCC AC AG G CG AA.CGCCCCCCGGT AG G ATTG G T ACCG TAAATTC 

a ALNYRTGVRLRGAILTMAFK 

AAGATCCTTAAGTTAAAGAACATTAAAGAGAAATCCCTGGGTGAGCTCATCAACATTTGC

781 + + + + + + 840

TTCTAGGAATTCAATTTCTTGTAATTTCTCTTTAGGGACCCACTCGAGTAGTTGTAAACG

a KILKLKNIKEKSLGELINIC -

TCCAACGATGGGCAG AGAATGTTTG AGGCAGCAGCCGTTGGC AGCCTGCTGGCTGG AGG A 

84 1 4- + + + + — - h 900 

AGGTTGCTACCCGTCTCTT ACAAACTCCGTCGTCGGCAACCGTCGG ACG ACCG ACCTCCT 

a SNDGQRMFEAAAVGSLLAGG - 

CCCGTTGTTGCCATCTTAGGCATG ATTTATAATGTAATTATTCTGGGACCAACAGGCTTC 

901 + + + + + + 960 

GGGCAACAACGGTAGAATCCGTACTAAATATTACATTAATAAGACCCTGGTTGTCCGAAG 

a PVVAILGM1YNVIILG PTGF - 

CTGGGATCAGCTGTTTTTATCCTCTTTTACCCAGCAATGATGTTTGCATCACGGCTCACA 

961 + + + + -4 + 1020 

GACCCTAGTCGACAAAAATAGGAGAAAATGGGTCGTTACTACAAACGTAGTGCCGAGTGT 

a LGSAVFILFYPAMMFASRLT - 

GCATATTTCAGGAGAAAATGCGTGGCCGCCACGGATGAACGTGTCCAGAAGATGAATGAA 

1021 + + + + + + 1080 

CGTATAAAGTCCTCTTTTACGCACCGGCGGTGCCTACTTGCACAGGTCTTCTACTTACTT 

a AYFRRKCVAATDERVGKMNE - 

GTTCTTACTTACATTAAATTTATCAAAATGTATGCCTGGGTCAmAAGCATTTTCTCAGAGT 

1081 4- + + + + + 1140 

CAAGAATGAATGTAATTTAAATAGTTTTACATACGGACCCAGTTTCGTAAAAGAGTCTCA 

a VLTYIKF1KMYAWVK AFSOS - 

GTTCAGAAAATCCGCGAGGAGGAGCGTCGGATATTGGAAAAAGCCGGGTACTTCCAGGGT 

1 141 + + + + + - 1200 

CAAGTCTTTTAGGCGCTCCTCCTCGCAGCCTATAACC1 I I I I CGGCCCATG AAGGTCCCA 

a VGKIREEERRILEKAGYFQG 

ATCACTGTGGGTGTGGCTCCCATTGTGGTGGTGATTGCCAGCGTGGTGACCTTCTCTGTT 

1201 + + + + + + 1260 

TAGTGACACCCACACCGAGGGTAACACCACCACTAACGGTCGCACCACTGGAAGAGACAA

a ITVGVAPIVVVIASVVTFSV -

CATATG ACCCTGGGCTTCG ATCTGACAGCAGCACAGGCTTTCAC AGTGGTG ACAGTCTTC 

1261 + + + 4 + 4 1320 

GTATACTGGGACCCGAAGCTAG ACTGTCGTCGTGTCCG AAAGTGTCACCACTGTCAGAAG 

s HMTLGFDLTAAQ AFTVVTVF - 

AA"H CCATGACTTTTGCTTTGAAAGTAACACCGTTTTCAGTAAAGTCCCTCTCAG AAGCC 

1321 + 4- + + + - -+ 1380 

TTAAGGTACTGAAAACGAAACTTTCATTGTGGCAAAAGTCATTTCAGGG AGAGTCTTCGG 

a NSMTFALKVTPFSVKSLSEA - 

TC AGTGGCTGTTG AC AG ATTT A AG AGTTTGTTTCT A ATG G A AG AG GTTCACATG AT A A AG 

1381 + + — + + + 1440 

AGTCACCGACAACTGTCTAAATTCTCAAACAAAGATTACCTTCTCCAAGTGTACTATTTC 

a SVAVDRFKSLFLMEEVHMIK - 

AACAAACCAGCCAGTCCTCACATCAAGATAGAGATGAAAAATGCCACCTTGGCATGGGAC 
1441 + h 4- 4- 4- + 1500 

TTGTTTG GTCG GTC AG G AGTGTAGTTCTATCTCTACTTTTT ACG GTG G A ACCGT ACCCTG 
a NKPASPHIKIEMKNATLAWD - 

TCCTCCCACTCCAGTATCCAGAACTCGCCCAAGCTGACCCCCAAAATGAAAAAAGACAAG 

1501 + + + + + + 1560

AGG AG G GTG AG GTC ATAG G TCTTG AG CGGGTTCG ACTG G G G G TTTT ACTTTTTTCTGTTC 

a SSHSStQNSPKLTPKMKKDK - 

AGGGCTTCCAGGGGCAAGAAAGAGAAGGTGAGGCAGCTGCAGCGCACTGAGCATCAGGCG 

1561 4- + 4- 4- + -f 1620 

TCCCGAAGGTCCCCG TTCTTTCTCTTCCACTCCGTCG ACGTCGCGTG ACTCGTAGTCCGC 

a RASRGKKEKVRQLQRTEHQA - 

GTGCTGGCAGAGCAG AAAGGCCACCTCCTCCTGGACAGTGACGAGCGGCCCAGTCCCGAA 

1621 + — + 4 + + + 1680 

CACGACCGTCTCGTCTTTCCGGTGGAGGAGGACCTGTCACTGCTCGCCGGGTCAGGGCTT

a VLAEQKGHLLLDSDERPSPE

GAGGAAGAAGGCAAGCACATCCACCTGGGCCACCTGCGCTTACAGAGGACACTGCACAGC

1681 + + + + + + 1740

CTCCTTCTTCCGTTCGTGTAGGTGGACCCGGTGGACGCGAATGTCTCCTGTGACGTGTCG

a EEEGKHIHLGHLRLQRTLHS

ATCGATCTGGAGATCCAAGAGGGTAAACTGGTTGGAATCTGCGGCAGTGTGGGAAGTGGA

1741 + + + + + + 1800

TAGCTAGACCTCTAGGTTCTCCCATTTGACCAACCTTAGACGCCGTCACACCCTTCACCT

a IDLEIQEGKLVGICGSVGSG -

AAAACCTCTCTCATTTCAGCCATTTTAGGCCAGATGACGCTTCTAGAGGGCAGCATTGCA

1801 + + + + + + 1860

TTTTGGAGAGAGTAAAGTCGGTAAAATCCGGTCTACTGCGAAGATCTCCCGTCGTAACGT

a KTSLISAILGQMTLLEGSIA -

ATCAGTGGAACCTTCGCTTATGTGGCCCAGCAGGCCTGGATCCTCAATGCTACTCTGAGA

1861 + + + + + + 1920

TAGTCACCTTGGAAGCGAATACACCGGGTCGTCCGGACCTAGGAGTTACGATGAGACTCT

a ISGTFAYVAQQAWILNATLR -

GACAACATCCTGTTTGGGAAGGAATATGATGAAGAAAGATACAACTCTGTGCTGAACAGC

1921 + + + + + + 1980

CTGTTGTAGGACAAACCCTTCCTTATACTACTTCTTTCTATGTTGAGACACGACTTGTCG

a DNILFGKEYDEERYNSVLNS -

TGCTGCCTGAGGCCTGACCTGGCCATTCTTCCCAGCAGCGACCTGACGGAGATTGGAGAG

1981 + + + + + + 2040

ACGACGGACTCCGGACTGGACCGGTAAGAAGGGTCGTCGCTGGACTGCCTCTAACCTCTC

a CCLRPDLAILPSSDLTEIGE -

CGAGGAGCCAACCTGAGCGGTGGGCAGCGCCAGAGGATCAGCCTTGCCCGGGCCTTGTAT

2041 + + + + + + 2100

GCTCCTCGGTTGGACTCGCCACCCGTCGCGGTCTCCTAGTCGGAACGGGCCCGGAACATA

a RGANLSGGQRQRISLARALY -

AGTGACAGGAGCATCTACATCCTGGACGACCCCCTCAGTGCCTTAGATGCCCATGTGGGC

2101 + + + + + + 2160

TCACTGTCCTCGTAGATGTAGGACCTGCTGGGGGAGTCACGGAATCTACGGGTACACCCG

a SDRSIYILDDPLSALDAHVG -

AACCACATCTTCAATAGTGCTATCCGGAAACATCTCAAGTCCAAGACAGTTCTGTTTGTT 

2161 + + -»- + + + + 2220 

TTGGTGTAG AAGTTATCACGATAGGCCTTTGTAGAGTTCAGGTTCTGTCAAGACAAACAA 

a NHifNSAIRKHLKSKTVLFV- 

ACCCACCAGTTACAGTACCTGGTTGACTGTGATGAAGTGATCTTCATGAAAGAGGGCTGT 

2221 + + + + + + 2280 

TGGGTGGTCAATGTCATGGACCAACTGACACTACTTCACTAGAAGTACTTTCTCCCGACA 

a THGLGYLVDCDEVIFMKEGC - 

ATT ACG G AAAG AG GC ACCC ATG AG G A ACTG ATG AATTTA A ATG G TG ACT ATG CT ACC ATT 

2281 + + + + + + 2340 

T AATG CCTTTCTCCGTG G GTACTCCTTG ACT ACTT AA ATTT A C C ACTG AT ACG ATG G T A A 

a ITERGTHEELMNLIMGDYATI - 

TTTAATAACCTGTTGCTGGGAGAGACACCGCCAGTTGAGATCAATTCAAAAAAGGAAACC

2341 + + + + + + 2400

AAATTATTGGACAACGACCCTCTCTGTGGCGGTCAACTCTAGTTAAGTTTTTTCCTTTGG

a FNNLLLGETPPVEINSKKET-

AGTGGTTCACAGAAGAAGTCACAAGACAAGGGTCCTAAAACAGGATCAGTAAAGAAGGAA

2401 + + + + + + 2460

TCACCAAGTGTCTTCTTCAGTGTTCTGTTCCCAGGATTTTGTCCTAGTCATTTCTTCCTT

a SGSQKKSQDKGPKTGSVKKE-

AAAGCAGTAAAGCCAGAGGAAGGGCAGCTTGTGCAGCTGGAAGAGAAAGGGCAGGGTTCA 

2461 4- + + + 4- + 2520 

TTTCGTCATTTCGGTCTCCTTCCCGTCGAACACGTCGACCTTCTCTTTCCCGTCCCAAGT 

a KAVKPEEGQLVQLEEKGQGS -

GTGCCCTGGTCAGTATATGGTGTCTACATCCAGGCTGCTGGGGGCCCCTTGGCATTCCTG

2521 + — - + + + + + 2580 

CACGGGACCAGTCATATACCACAG ATGTAGGTCCGACG ACCCCCGGGGAACCGTAAGGAC 

a VPWSVYGVYi GAAGGPLAFL - 

GTTATTATGGCCCTTTTCATGCTG AATGTAGGCAGCACCGCCTTCAGCACCTGGTGGTTG 
2581 + -- + + + - - +— + 2640 

CAATAATACCGGG AAAAGTACG ACTTACATCCGTC^TGGCGG AAGT.CGTGGACCACCAAC 
a VIMALFMLNVGSTAFSTWWL - 

AGTTACTGGATCAAGCAAGGAAGCGGGAACACCACTGTGACTCGAGGGAACGAGACCTCG 

2641 + + + + + + 2700 

TCAATG ACCTAGTTCGTTCCTTCGCCCTTGTGGTG AC ACTG AG CTCCCTTGCTCTGG AGC 

a SYW1KQGSGNTTVTRGNETS 

GTGAGTGACAGCATGAAGGACAATCCTCATATGCAGTACTATGCCAGCATCTACGCCCTC 

2701 + + + + + + 2760 

CACTCACTGTCGTACTTCCTGTTAGGAGTATACGTCATGATACGGTCGTAGATGCGGGAG 

a VSDSMKDNPHMGYYASIYAL 

TCC ATG G C AGTC ATG CTG ATCCTG AAAG CC ATTCG AG G AGTTG TCTTTG TC AAG G G C ACG 

2761 + + + + + + 2820 

AG GTACCGTC AGT ACG ACT AG G ACTTTCG G TAAG CTCCTC AAC AG AAAC AGTTCCCG TG C 

a SMAVMLILKAIRGVVFVKGT- 

CTGCGAGCTTCCTCCCGGCTGCATGACGAGCTTTTCCGAAGGATCCTTCGAAGCCCTATG 

2821 + + + + + + 2880 

GACGCTCGAAGGAGGGCCGACGTACTGCTCGAAAAGGCTTCCTAGGAAGCTTCGGGATAC 

a LRASSRLHDELFRR1LRSPM - 

AAGTTTTTTGACACGACCCCCACAGGGAGGATTCTCAACAGGTTTTCCAAAGACATGGAT 

2881 + + 4- + + + 2940 

TTC AtAAAAACTGTG CTG G G G G TGTCCCTCCT A AG AGTTGTCC AA A AGGTTTCTGT ACCT A 

a KFFDTTPTGRILNRFSKDMD 

GAAGTTGACGTGCGGCTGCCGTTCCAGGCCGAGATGTTCATCCAGAACGTTATCCTGGTG

2941 + + + + + + 3000

CTTCAACTGCACGCCGACGGCAAGGTCCGGCTCTACAAGTAGGTCTTGCAATAGG ACCAC 

a EVDVRLPFQAEMFIQNVILV - 

TTCTTCTGTGTGGG AATG ATCGCAGG AGTCTTCCCGTGGTTCCTTGTGGCAGTGGGGCCC 

3001 + + + + + + 3060 

AAG AAG ACACACCCTTACTAGCGTCCTCAG AAGGGCACCAAGGAACACCG 1 CACCCCGGG 

a FFCVGMIAGVFPWFLVAVGP - 

CTTGTC ATCCTCTTTTC AGTCCTG C AC ATTGTCTCC AG G G TCCTG ATTCG G G AG CTG AAG 

3061 + + + + + + 3120 

G AAC AG TAG G AG AA AAGTC AGG ACGTGT AAC AG AG G TCCC AG G ACT A AG CCCTCG ACTTC 

a LVILFSVLHIVSRVLIRELK - 

CGTCTGGACAATATCACGCAGTCACCTTTCCTCTCCCACATCACGTCCAGCATACAGGGC 

3121 + + + + -+- + 3180 

GCAGACCTGTTATAGTGCGTCAGTGGAAAGGAGAGGGTGTAGTGCAGGTCGTATGTCCCG 

a RLDNITQSPFLSHITSSIQG - 

CTTGCCACCATCCACGCCTACAATAAAGGGCAGGAGTTTCTGCACAGATACCAGGAGCTG 

3181 + + + + + + 3240 

GAACGGTGGTAGGTGCGGATGTTATTTCCCGTCCTCAAAGACGTGTCTATGGTCCTCGAC 

a LATIHAYNKGQEFLHRY Q E L - 

CTG G ATG AC AACC AAG CTCCTTTTTTTTTGTTT ACG TG TG CG ATG CG GTG G CTG GCTGTG 

3241 + + + + + 3300 

GACCTACTGTTGGTTCGAGGAAAAAj\AAACAAATGCACACGCTACGCCACCGACCGACAC 

a LDDNGAPFFLFTCAMRWLAV - 

CGGCTGGACCTCATCAGCATCGCCCTCATCACCACCACGGGGCTGATGATCGTTCTTATG 

3301 + + + + + + 3360 

GCCGACCTGGAGTAGTCGTAGCGGGAGTAGTGGTGGTGCCCCGACTACTAGCAAGAATAC 

a RLDL1SIALITTTGLMIVLM 

CACGGGCAGATTCCCCCAGCCTATGCGGGTCTCGCCATCTCTTATGCTGTCCAGTTAACG

3361 + + + + + + 3420

GTGCCCGTCTAAGGGGGTCGGATACGCCCAGAGCGGTAGAGAATACGACAGGTCAATTGC

a HGQIPPAYAGLAISYAVQLT -

GGGCTGTTCCAGTTTACGGTCAGACTGGCATCTG AGACAG AAGCTCG ATTCACCTCGGTG 

3421 + .._ _ 4 4 4. 4. + 3480 

CCCGACAAGGTCAAATGCCAGTCTGACCGTAG ACTCTGTCTTCG AGCT AAGTGG AGCCAC 

a GLFQFTVRLASETEAHFTSV - 

G AGAGG ATCAATCACTACATTAAGACTCTGTCCTTGG AAGCACCTGCCAG AATTAAGAAC 

348 1 + + + 4- + + 3540 

CTCTCCTAGTTAGTGATGTAATTCTGAGACAGGAACCTTCGTGGACGGTCTTAATTCTTG 

a ERINHYtKTLSLEAPARIKN - 

AAGGCTCCCTCCCCTGACTGGCCCCAGGAGGGAGAGGTGACCTTTGAGAACGCAG AGATG 

3541 + + + + + + 3600

TTCCG AG G G AG G G G ACTG ACCG G G G TCCTCCCTCTCC ACT G G A A ACTCTTG CG TCTCTAC 

a KAPSPDWPGEGEVTFENAEM - 

AGGTACCGAGAAAACCTCCCTCTTGTCCTAAAGAAAGTATCCTTCACGATCAAACCTAAA 

3601 -+• + + ■ — + + + 3660 

TCC ATG GCTCTTTTG G AGG G AG A AC AGG ATTTCTTTCAT AG G A AGTG CT AG TTTG G ATTT 

a RYRENLPLVLKKVSFTIKPK - 

GAGAAGATTGGCATTGTGGGGCGGACAGGATCAGGGAAGTCCTCGCTGGGGATGGCCCTC 

3661 + + + + + + 3720 

CTCTTCTAACCGTAACACCCCGCCTGTCCTAGTCCCTTCAGGAGCGACCCCTACCGGGAG 

a EKIGIVGRTGSGKSSLGMAL- 

TTCCGTCTGGTGGAGTTATCTGGAGGCTGCATCAAGATTGATGG AGTGAGAATCAGTG AT 

3721 4- + + + + + 3780 

AAGGCAGACCACCTCAATAGACCTCCGACGTAGTTCTAACTACCTCACTCTTAGTCACTA 

a FRLVELSGGCiKlDGVRlSD 

ATTGGCCTTGCCG ACCTCCGAAGCAAACTCTCTATCATTCCTCAAGAGCCGGTGCTGTTC 

3781 + + + V 4- + 3840 

TAACCGGAACGGCTGGAGGCTTCGTTTGAGAGATAGTAAGGAGTTCTCGGCCACGACAAG

a IGLADLRSKLSIIPQEPVLF

AGTGGCACTGTCAGATCAAATTTGGACCCCTTCAACCAGTACACTGAAGACCAGATTTGG 

384 1 — + - + + + + — - 4 3900 

TCACCGTGACAGTCTAGTTTAAACCTGGGGAAGTTGGTCATGTG ACTTCTGGTCTAAACC 

a SGTVRSNlDPFNQYTEDQfW 

GATGCCCTGGAGAGG ACACACATGAAAG AATGTATTGCTCAGCTACCTCTGAAACTTGAA 

3901 + — + + + + + 3960 

CTACGGGACCTCTCCTGTGTGTACTTTCTTACATAACGAGTCGATGGAGACTTTGAACTT 

a DALERTHMKECIAOLPLKLE - 

TCTGAAGTGATGGAGAATGGGGATAACTTCTCAGTGGGGGAACGGCAGCTCTTGTGCATA 
3961 + + + + + + 4020 

AGACTTCACTACCTCTTACCCCTATTGAAGAGTCACCCCCTTGCCGTCGAGAACACGTAT 

a SEVMENGDNFSVGERGLLCl 

G CTAG AGCCCTG CTCCG CCACTGTAAG ATTCTG ATTTT AG ATG A AG CC ACAGCTG CC ATG 

4021 : — + +- + 4- 4- + 4080 

CG ATCTCG GG ACG AG G CGGTG AC ATTCTAAG ACTA AAtATCT ACTTCG GTGTCG ACG GT AC 

a ARALLRHCKIL1 LDEATAAM - 

GACACAGAGACAGACTTATTGATTCAAGAGACCATCCGAGAAGCATTTGCAGACTGTACC 

4081 + -t- 4- + + + 4140 

CTGTGTCTCTGTCTG AATAACT AAGTTCTCTG GTAG G CTCTTCGT AAACGTCTG AC ATG G 

a DTETDLLiQETI REAFADCT - 

ATGCTG ACCATTGCCC ATCG CCTG C AC ACG GTTCTAG G CTCCG AT AGG ATTATG GTG CTG 

4141 + 4- 4- 4 + + 4200 

TACG ACTGGT AACG G GT AG CG G ACGTG TGCC AAG ATCCG AGG CTATCCT AATACC ACG AC 

a MLTIAHRLHTVLGSORIMVL - 

GCCCAGGGACAGGTGGTGGAGTTTGACACCCCATCGGTCCTTCTGTCCAACGACAGTTCC 

4201 + 4- + + + + 4260 

CGGGTCCCTGTCCACCACCTCAAACTGTGGGGTAGCCAGGAAGACAGGTTGCTGTCAAGG

a AQGQVVEFDTPSVLLSNDSS

CGATTCTATGCCATGTTTGCTGCTGCAGAGAACAAGGTCGCTGTCAAGGGCTGA

4261 + + + + + 4314

GCTAAGATACGGTACAAACGACGACGTCTCTTGTTCCAGCGACAGTTCCCGACT

a RFYAMFAAAENKVAVKG*

MOAT D cDNA AND AMINO ACID SEQUENCE ENCODED THEREBY
ATGGACGCCCTGTGCGGTTCCGGGGAGCTCGGCTCCAAGTTCTGGGACTCCAACCTGTCT

1 + + + + + + 60

TACCTGCGGGACACGCCAAGGCCCCTCGAGCCGAGGTTCAAGACCCTGAGGTTGGACAGA

a MDALCGSGELGSKFWDSNLS -

GTGCACACAGAAAACCCGGACCTCACTCCCTGCTTCCAGAACTCCCTGCTGGCCTGGGTG

61 + + + + + + 120

CACGTGTGTCTTTTGGGCCTGGAGTGAGGGACGAAGGTCTTGAGGGACGACCGGACCCAC

a VHTENPDLTPCFQNSLLAWV

CCCTGCATCTACCTGTGGGTCGCCCTGCCCTGCTACTTGCTCTACCTGCGGCACCATTGT 
121 + + + -+• + + 180 

GGGACGTAGATGGACACCCAGCGGGACGGGACGATGAACGAGATGGACGCCGTGGTAACA 

a PCIYLWVAEPCYLLYLRHHC - 

CGTGGCTACATCATCCTCTCCCACCTGTCCAAGCTCAAGATGGTCCTGGGTGTCCTGCTG 

181 + + + + + + 240 

GCACCGATGTAGTAGGAGAGGGTGGACAGGTTCGAGTTCTACCAGGACCCACAGGACGAC 

a RGYIILSHtSKLKMVLGVLL - 

TG GTG CGTCTCCTGG GCGG ACCTTTTTTACTCCTTCC ATG G CCTG G TCC ATG G CCG G GCC 

241 + + + + + + 300 

ACCACGCAGAGGACCCGCCTGGAAAAAATGAGGAAGGTACCGGACCAGGTACCGGCCCGG 

a WCVSWADLEYSFHGLVHGRA - 

CCTGCCCCTGTTTTCTTTGTCACCCCCTTGGTGGTGGGGGTCACCATGCTGCTGGCCACC 
301 + + + + + + 360 

GGACGGGGACAAAAGAAACAGTGGGGGAACCACCACCCCCAGTGGTACG ACGACCGGTGG 
a PAPVFFVTPLVVGVTMLLAT - 

CTGCTGATACAGTATGAGCGGCTGCAGGGCGTACAGTCTTCGGGGGTCCTCATTATCTTC

361 + + + + + + 420

GACGACTATGTCATACTCGCCGACGTCCCGCATGTCAGAAGCCCCCAGGAGTAATAGAAG

a LLIQYERLQGVQSSGVLIIF -

TGGTTCCTGTGTGTGGTCTGCGCCATCGTCCCATTCCGCTCCAAGATCCTTTTAGCCAAG 

421 + + + + + + 480

ACCAAGGACACACACCAGACGCGGTAGCAGGGTAAGGCGAGGTTCTAGGAAAATCGGTTC

a WFLCVVCAIVPFRSKILLAK -

GCAGAGGGTGAGATCTCAGACCCCTTCCGCTTCACCACCTTCTACATCCACTTTGCCCTG

481 + + + + + + 540

CGTCTCCCACTCTAGAGTCTGGGGAAGGCGAAGTGGTGGAAGATGTAGGTGAAACGGGAC

a AEGEISDPFRFTTFYIHFAL -

GTACTCTCTGCCCTCATCTTGGCCTGCTTCAGGGAGAAACCTCCATTTTTCTCCGCAAAG 

541 h 4 ■+- h + 600 

CATGAGAGACGGGAGTAGAACCGGACGAAGTCCCTCTTTGGAGGTAAAAAGAGGCGTTTC 

a VLSALILACFREKPPFFSAK - 

AATGTCG ACCCT AACCCCT ACCCTG AG ACC AG CG CTG G CTTTCTCTCCCG CCTG IIIllC 
601 + + + + + + 660 

TT AC AG CTG G G ATTG G G G ATG G G ACTCTG G TCG CG ACCG AAAG AG AG G G CG G AC AAAAA G 

a NVDPNPYPETSAGFLSRLFF- 

TGGTGGTTCACAAAGATGGCCATCTATGGCTACCGGCATCCCCTGGAGGAGAAGGACCTC 

661 + + + + + + 720 

ACC ACC AAGTGTTTCT ACCG GT AG AT ACCG ATG G CCGT AG G G G ACCTCCTCTTCCTG G AG 

a WWFTKMAIYGYRHPLEEKDL - 

TGGTCCCTAAAGGAAGAGGACAGATCCCAGATGGTGGTGCAGCAGCTGCTGGAGGCATGG 

721 + + + + + + 780 

ACCAGGGATTTCCTTCTCCTGTCTAGGGTCTACCACCACGTCGTCGACGACCTCCGTACC 

a WSLKEEDRSGMVVQGLLEAW - 

AGGAAGCAGGAAAAGCAGACGGCACGACACAAGGCTTCAGCAGCACCTGGGAAAAATGCC

781 + + + + + + 840

TCCTTCGTCCTTTTCGTCTGCCGTGCTGTGTTCCGAAGTCGTCGTGGACCCTTTTTACGG

a RKQEKQTARHKASAAPGKNA -

TCCGGCGAGG ACGAGGTGCTGCTGGGTGCCCGGCCCAGGCCCCGG AAGCCCTCCTTCCTG 

84 1 + -— + + — . + + ^ 900 

AGGCCGCTCCTGCTCCACGACG ACCCACGGGCCGGGTCCGGGGCCTTCGGG AGG AAGG AC 
a SGEDEVLLGARPRPRKPSFL - 

AAGGCCCTGCTGGCCACCTTCGGCTCCAGCTTCCTCATCAGTGCCTGCTTCAAGCTTATC 
901 + + + + + + 960

TTCCGGGACGACCGGTGGAAGCCGAGGTCGAAGGAGTAGTCACGGACGAAGTTCGAATAG 

a KALLATFGSSFLISACFKLl - 

CAGGACCTGCTCTCCTTCATCAATCCACAGCTGCTCAGCATCCTGATCAGGTTTATCTCC 

961 + + + + + + 1020 

GTCCTGGACGAGAGGAAGTAGTTAGGTGTCGACGAGTCGTAGGACTAGTCCAAATAGAGG 

a QDLtSFINPGLLSILIRFIS - 

AACCCCATGGCCCCCTCCTGGTGGGGCTTCCTGGTGGCTGGGCTGATGTTCCTGTGCTCC 

1021 + + + + + + 1080 

TTGGGGTACCGGGGGAGGACCACCCCGAAGGACCACCGACCCGACTACAAGGACACGAGG 

a NPMAPSWWGFLVAGLMFLCS - 

ATGATGCAGTCGCTGATCTTACAj^CACTATTACCACTACATCTTTGTGACTGGGGTGAAG 

1081 + + + + + + 1140 

TACTACGTCAGCGACTAGAATGTTGTGATAATGGTGATGTAGAAACACTGACCCCACTTC 

a MMQSL1LGHYYHYIFVTGVK - 

TTTCGTACTGGGATCATGGGTGTCATCTACAGGAAGGCTCTGGTTATCACCAACTCAGTC 

1141 + + + - + + + 1200 

AAAGCATGACCCTAGTACCCACAGTAGATGTCCTTCCGAGACCAATAGTGGTTGAGTCAG 

a FRTGIMGVIYRKALVITNSV - 

AAACGTGCGTCCACTGTGGGGGAAATTGTCAACCTCATGTCAGTGGATGCCCAGCGCTTC

1201 + + + + + + 1260

TTTGCACGCAGGTGACACCCCCTTTAACAGTTGGAGTACAGTCACCTACGGGTCGCGAAG

a KRASTVGEIVNLMSVDAQRF -

ATGGACCTTGCCCCCTTCCTCAATCTGCTGTGGTCAGCACCCCTGCAGATCATCCTGGCG 

1261 — - + — + + + + + 1320 

TACCTGG AACGGGGG AAGGAGTTAGACGACACCAGTCGTGGGGACGTCTAGTAGGACCGC 

a MDLAPFLNLLWSAPLQI1LA 

ATCTACTTCCTCTGGCAGAACCTAGGTCCCTCTGTCCTGGCTGG AGTCGCTTTCATGGTC 

1321 + + + + + + 1380 

TAGATGAAGGAGACCGTC^TGGATCCAGGGAGACAGGACCGACCTCAGCGAAAGTACCAG 

a IYFLWQNLGPSVLAGVAFMV - 

TTGCTGATTCCACTCAACGGAGCTGTGGCCGTGAAGATGCGCGCCTTCCAGGTAAAGCAA 

1381 + + + + + + 1440 

AACGACTAAGGTGAGTTGCCTCGACACCGGCACTTCTACGCGCGGAAGGTCCATTTCGTT 

a LL1PLNGAVAVKMRAFQVKG - 

ATG AAATTG AAG G ACTCG CG CATC A AG CTG ATG AGTG AG ATCCTG AACG G CATC A AG G TG 

1441 + + + + + + 1500 

TACTTTAACTTCCTGAGCGCGTAGTTCGACTACTCACTCTAGGACTTGCCGTAGTTCCAC 

a MKLKDSRIKLMSEILNGIKV- 

CTGAAGCTGTACGCCTGGGAGCCCAGCTTCCTGAAGCAGGTGGAGGGCATCCGGCAGGGT 

1501 + + + + + + 1560

GACTTCGACATGCGGACCCTCGGGTCGAAGGACTTCGTCCACCTCCCGTAGGCCGTCCCA 

a LKLYAWEPSFLKQVEGIRQG - 

GAGCTCCAGCTGCTGCGCACGGCGGCCTACCTCCACACCACAACCACCTTCACCTGGATG 

1561 + + + + + + 1620 

CTCGAGGTCGACG ACGCGTGCCGCCGGATGGAGGTGTGGTGTTGGTGGAAGTGGACCTAC 

a ELQLLRTAAYLHTTTTFTWM - 

TGCAGCCCCTTCCTGGTGACCCTGATCACCCTCTGGGTGTACGTGTACGTGGACCCAAAC 

1621 + + + + 4- — + 1680 

ACGTCGGGGAAGGACCACTGGGACTAGTGGGAGACCCACATGCACATGCACCTGGGTTTG

a CSPFLVTLITLWVYVYVDPN -

AATGTGCTGGACGCCGAG AAGGCCTTTGTGTCTGTGTCCTTGTTTAATATCTTAAGACTT 

1681 + + + +■ + -+ 1740 

TTACACGACCTGCGGCTCTTCCGGAAACACAGACACAGG AACAAATTATAGAATTCTG AA 

a MVLDAEKAFVSVSLFNILRL - 

CCCCTCAACATGCTGCCCCAGTTAATCAGCAACCTG ACTCAGGCCAGTGTGTCTCTGAAA 

1741 + + -f — + — - + + 1 800 

GGGGAGTTGTACGACGGGGTCAATTAGTCGTTGGACTGAGTCCGGTCACACAGAGACTTT 

a PLNMLPGLI SNLTQASVSLK - 

CG G ATCC AGC AATTCCTG AG CCA AG AG G AACTTG ACCCCC AG AGTGTGG AA AG A A AG ACC 

1801 4- 4- + + + + 1860 

GCCTAGGTCGTTAAGGACTCGGTTCTCCTTGAACTGGGGGTCTCACACCTTTCT7TCTGG 

a R1QQFLSQEELDPQSVERKT- 

ATCTCCCCAGGCTATGCCATCACCATACACAGTGGCACCTTCACCTGGGCCCAGGACCTG 

1861 + + + + + + 1920 

TAGAGGGGTCCGATACGGTAGTGGTATGTGTCACCGTGGAAGTGGACCCGGGTCCTGGAC 

a ISPGYAITIHSGTFTWAQDL - 

CCCCCCACTCTGCACAGCCTAGACATCCAGGTCCCGAAAGGGGCACTGGTGGCCGTGGTG 

1921 + + -+- 4- + + 1980 

GGGGGGTGAGACGTGTCGGATCTGTAGGTCCAGGGCTTTCCCCGTGACCACCGGCACCAC 

a PPTLHSLDIQVPKGALVAVV - 

GGGCCTGTGGGCTGTGGGAAGTCCTCCCTGGTGTCTGCCCTGCTGGGAGAGATGGAGAAG 

1981 + + + 4- + + 2040 

CCCGGACACCCGACACCCTTCAGGAGGGACCACAGACGGGACGACCCTCTCTACCTCTTC 

a GPVGCGKSSLVSALLGEMEK - 

CTAGAAGGCAAAGTGCACATG AAGGCATGGATCCAGAACTGCACTCTTCAGGAAAACGTG 

2041 + 4- + 4- + 4 2100 

GATCTTCCGTTTCACGTGTACTTCCGTACCTAGGTCTTGACGTGAGAAGTCCTTTTGCAC 

a LEGKVHMKAWIQNCTLQENV -

CTTTTCGGCAAAGCCCTGAACCCCAAGCGCTACCAGCAGACTCTGGAGGCCTGTGCCTTG

2101 + + + + -- + -t 2160 

G AAAAGCCGTTTCGGGACTTGGGGTTCGCGATGGTCGTCTG AG ACCTCCGG ACACGGAAC 

a LFGKALNPKRYOQTLEACAL - 

CTAGCTGACCTGG AGATGCTGCCTGGTGGGG ATCAG AC AGAG ATTGG AGAGAAGGGCATT 

2161 + + + + + + 2220

GATCGACTGG ACCTCTACG ACGG ACCACCCCTAGTCTGTCTCTAACCTCTCTTCCCGTAA 

a LADLEMLPGGOQTE1GEKGI 

AACCTGTCTGGGGGCCAGCGGCAGCGGGTCAGTCTGGCTCGAGCTGTTTACAGTGATGCC 

2221 4- + + + + + 2280 

TTGGACAGACCCCCGGTCGCCGTCGCCCAGTCAGACCGAGCTCGACAAATGTCACTACGG 

a NLSGGQRGRVSLARAVYSDA - 

G ATATTTTCTTGCTGG ATG ACCC ACTGTCCG CG GTG G ACTCTC ATGTGG CC AAGC AC ATC 

2281 4- 4- 4- 4- + 4 2340 

CTATAAAAGAACGACCTACTGGGTGACAGGCGCCACCTGAGAGTACACCGGTTCGTGTAG 

a DIFLLDDPLSAVDSHVAKHI - 

TTTG ACC ACGTCATCG G G CC AG AAG G CG TG CTG G CAG G C A AG ACG CG AGTG CTG GTG ACG 

2341 4- + + + + + 2400 

AAACTGGTGCAGTAGCCCGGTCTTCCGCACGACCGTCCGTTCTGCGCTCACGACCACTGC 

a FOHVIGPEGVLAGKTRVLVT - 

CACGGCATTAGCTTCCTGCCCCAGACAGACTTCATCATTGTGCTAGCTGATGGACAGGTG 

2401 4- 4- 4 + 4 + 2460 

GTGCCGTAATCGAAGGACGGGGTCTGTCTGAAGTAGTAACACGATCGACTACCTGTCCAC 

a HGISFLPGTDF1IVLADGQV - 

TCTGAGATGGGCCCGTACCCAGCCCTGCTGCAGCGCAACGGCTCCTTTGCCAACTTTCTC 

2461 4- 4- — + + 4 + 2520 

AGACTCTACCCGGGCATGGGTCGGGACGACGTCGCGTTGCCGAGGAAACGGTTGAAAGAG 

a SEMGPYPALLQRNGSFANFL

TGCAACTATGCCCCCGATGAGGACCAAGGGCACCTGGAGGACAGCTGGACCGCGTTGGAA

2521 + — - + + + - — + -—4 2580 

ACGTTG ATACGGGGGCTACTCCTGGTTCCCGTGG ACCTCCTGTCG ACCTGGCGCAACCTT 

a CNYAPDEDGGHLEDSWTALE - 

GGTGCAGAGG ATAAGGAGGCACTGCTG ATTG AAG ACACACTCAGCAACCACACGG ATCTG 

258 1 + -— + + + + + 2640 

CCACGTCTCCTATTCCTCCGTGACGACTAACTTCTGTGTG AGTCGTTGGTGTGCCTAGAC 

a GAEDKEALLIEDTLSMHTOL - 

ACAGACAATGATCCAGTCACCTATGTGGTCCAGAAGCAGTTTATGAGACAGCTGAGTGCC 

2641 + + + + 4- 4- 2700 

TGTCTGTTACTAGGTCAGTGGATACACCAGGTCTTCGTCAAATACTCTGTCGACTCACGG 

a TDNDPVTYVVOKQFMROLSA - 

CTGTCCTC AG ATG G G G AG GG AC AG G G TCGGCCTGT ACCCCG G AG G C ACCTG G G TCC ATC A 

2701 + + + + 4- + 2760 

GACAGGAGTCTACCCCTCCCTGTCCCAGCCGGACATGGGGCCTCCGTGGACCCAGGTAGT 

a LSSDG EGGGRPVPRRHLGPS - 

GAGAAGGTGCAGGTGACAGAGGCGAAGGCAGATGGGGCACTGACCCAGGAGGAGAAAGCA 

2761 + + + 4- + 4- 2820 

CTCTTCCACGTCCACTGTCTCCGCTTCCGTCTACCCCGTGACTGGGTCCTCCTCTTTCGT 

a EKVQVTEAKADGALTQEEKA - 

G CCATTG G CACTG TG G AGCTC AG TG TGTTCTGG G ATT ATG CC A AG G CCG TG G G G CTCTGT 

2821 + + + + 4- 4 2880 

CGGTAACCGTGACACCTCG AGTCACACAAGACCCTAATACGGTTCCGGCACCCCGAGACA 

a AIGTVELSVFWOYAKAVGtC - 

ACC ACGCTG G CC ATCTGTCTCCTG T ATG TG G G TC A A AGTG CG G CTG CC ATTGG AG CC AJ\T 

2881 + + + + + 4 2940 

TGGTGCGACCGGTAGACAGAGGACATACACCCAGTTTCACGCCG ACGGTAACCTCGGTTA 

a TTLAiCLLYVGQSAAAIGAN 

GTGTGGCTCAGTGCCTGGACAAATGATGCCATGGCAGACAGTAGACAGAACAACACTTCC

Figure 14G

SUBSTITUTE SHEET (RULE 26)

WO 99/49735
PCT/US99/06644
42/56

2941 + + + + + + 3000

CACACCGAGTCACGGACCTGTTTACTACGGTACCGTCTGTCATCTGTCTTGTTGTGAAGG

a VWLSAWTNDAMADSRQNNTS 

CTG AGGCTGGGCGTCTATGCTGCTTTAGG AATTCTGC AAGGGTTCTTGGTG ATGCTGGCA 

3001 — - 4- + + + + ~ -+ 3060 

G ACTCCG ACCCGCAG ATACG ACG AAATCCTTA AG ACGTTCCCAAG AACCACTACG ACCGT 

a LRLGVYAALG ILQGFLVMLA - 

GCCATGGCCATGGCAGCGGGTGGCATCCAGGCTGCCCGTGTGTTGCACCAGGCACTGCTG 

3061 + + + + + + 3120 

CGGTACCGGTACCGTCGCCCACCGTAGGTCCGACGGGCACACAACGTGGTCCGTGACGAC 

a AMAMAAGGIQAARVLHGALL- 

C AC AAC AAG ATACG CTCG CC AC AG TCCTTCTTTG AC ACC AC ACC ATC AG GCCG CATCCTG 

3121 + + + + + + 3180 

GTGTTGTTCTATGCGAGCGGTGTCAGGAAGAAACTGTGGTGTGGTAGTCCGGCGTAGGAC 

a HNKIRSPOSFFDTTPSGRtL - 

AACTGCTTCTCC AAG G AC ATCT ATGTCGTTG ATG AG GTTCTG G CCCCTG TC ATCCTC ATG 

3181 + + + + + + 3240 

TTGACGAAGAGGTTCCTGTAGATACAGCAACTACTCCAAGACCGGGGACAGTAGGAGTAC 

a NCFSKD1YVVDEVL.APVILM - 

CTGCTCAATTCCTTCTTCAACGCCATCTCCACTCTTGTGGTCATCATGGCCAGCACGCCG 

3241 + + + + + + 3300 

GACGAGTTAAGGAAGAAGTTGCGGTAGAGGTGAGAACACCAGTAGTACCGGTCGTGCGGC 

a LLNSFFNA1STLVVIMASTP - 

CTCTTCACTGTGGTCATCCTGCCCCTGGCTGTGCTCTACACCTTAGTGCAGCGCTTCTAT 

3301 + + + + s- + 3360 

GAGAAGTGACACCAGTAGGACGGGGACCGACACGAGATGTGGAATCACGTCGCGAAGATA 

a LFTVV1LPLAVLYTLVQRFY - 

GCAGCCACATCACGGCAACTGAAGCGGCTGGAATCAGTCAGCCGCTCACCTATCTACTCC

3361 + + + + + + 3420

CGTCGGTGTAGTGCCGTTGACTTCGCCGACCTTAGTCAGTCGGCGAGTGGATAGATGAGG

a AATSRQLKRLESVSRSPIYS -

CACTTTTCGGAGACAGTG ACTGGTGCCAGTGTCATCCGGGCCTACAACCGCAGCCGGG AT 

3421 -—-i + — - + + -* + 3480 

GTGAAAAGCCTCTGTCACTGACCACGG1CACAG1 AGGCCCGGATGTTGGCGTCGGCCCTA 

a HFStTVTGASVlRAYNRSRD - 

TTTGAGATCATCAGTG ATACTAAGGTGGATGCCAACCAGAGAAGCTGCTACCCCTACATC 

3481 — + + + + + — h 3540 

AAACTCTAGTAGTCACTATGATTCCACCTACGGTTGGTCTCTTCGACGATGGGGATGTAG 

a FEtiSDTKVDANQRSCYPY! - 

ATCTCCAACCGGTGGCTGAGCATCGGAGTGGAGTTCGTGGG6AACTGCGTGGTGCTCTTT 

3541 + + + + + + 3600 

TAGAGGTTGGCCACCGACTCGTAGCCTCACCTCAAGCACCCCTTGACGCACCACGAGAAA 

a (SNRWLSIGVEFVGNCVVLF - 

GCTGCACTATTTGCCGTCATCGGGAGGAGCAGCCTGAACCCGGGGCTGGTGGGCCTTTCT 

3601 + + +■ + + 4 3660 

CGACGTGATAAACGGCAGTAGCCCTCCTCGTCGGACTTGGGCCCCGACCACCCGGAAAGA 

a AALFAV1GRSSLNPGLVGLS - 

GTGTCCTACTCCTTGCAGGTGACATTTGCTCTGAACTGGATGATACGAATGATGTCAGAT 

3661 + + + + + + 3720 

CACAGGATGAGGAACGTCCACTGTAAACGAGACTTGACCTACTATGCTTACTACAGTCTA 

a VSYSLGVTFALNWMIRMMSO - 

TTGGAATCTAACATCGTGGCTGTGGAGAGGGTCAAGGAGTACTCCAAGACAGAGACAGAG 

3721 + + + + + + 3780 

AACCTTAGATTGTAGCACCGACACCTCTCCCAGTTCCTCATGAGGTTCTGTCTCTGTCTC 

a LESNIVAVERVKEYSKTETE - 

GCGCCCTGGGTGGTGGAAGGCAGCCGCCCTCCCGAAGGTTGGCCCCCACGTGGGGAGGTG

3781 + + + + + + 3840

CGCGGGACCCACCACCTTCCGTCGGCGGGAGGGCTTCCAACCGGGGGTGCACCCCTCCAC

a APWVVEGSRPPEGWPPRGEV -

G AGTTCCGG AATTATTCTGTGCGCTACCGGCCGGGCCTAG ACCTGGTGCTG AG AG ACCTG 

384 1 + + — + + + + 3900 

CTCAAGGCCTTAATAAGACACGCG ATGGCCGGCCCGG ATCTGG ACCACG ACTCTCTGG AC 

a EFRNYSVRYRPG LDLVLPDL - 

AGTCTGCATGTGCACGGTGGCGAGAAGGTGGGGATCGTGGGCCGCAGTGGGGCTGGCAAG 
3901 + + + „ + + -f 3960 

TCAGACGTACACGTGCCACCGCTCTTCCACCCCTAGCACCCGGCGTGACCCCGACCGTTC 

a SLHVHGGEKVGIVG RTGAGK - 

TCTTCC ATG ACCCTTTG CCTGTTCCG C ATCCTG G AG G CG G C A A AG G GTG AA ATCCGC ATT 

3961 + + + +- 4 + 4020 

AGAAGGTACTGGGAAACGGACAAGGCGTAGGACCTCCGCCGTTTCCCACTTTAGGCGTAA 

a SSMTLCLFRILEAAKGEIRI - 

GATGGCCTCAATGTGGCAGACATCGGCCTCCATGACCTGCGCTCTCAGCTGACCATCATC 

4021 + + + + + + 4080 

CTACCGGAGTTACACCGTCTGTAGCCGGAGGTACTGGACGCGAGAGTCGACTGGTAGTAG 

a DGLNVAOIGLHDLRSQLT1I - 

CCG C AG G ACCCC ATCCTG TTCTCG G G G ACCCTG CG C ATG AACCTG G ACCCCTTCG G C AG C 

4081 + + + + + + 4140 

GGCGTCCTGGGGTAGGACAAGAGCCCCTGGGACGCGTACTTGGACCTGGGGAAGCCGTCG 

a PQDPILFSGTLRMNLDPFGS- 

TACTCAGAGGAGGACATTTGGTGGGCTTTGGAG CTGTCCC ACCTG CACACGTTTGTG AGC 

4141 + + ~ — + + + + 4200 

ATGAGTCTCCTCCTGTAAACCACCCG AAACCTCG ACAGGGTGGACGTGTGCAAACACTCG 

a YSEEDIWWALELSHLHTFVS 

TCCCAGCCGGCAGGCCTGGACTTCCAGTGCTCAGAGGGCGGGGAGAATCTCAGCGTGGGC 

4201 + + + -4- + -t 4260 

AGGGTCGGCCGTCCGGACCTGAAGGTCACGAGTCTCCCGCCCCTCTTAGAGTCGCACCCG

a SQPAGLDFQCSEGGENLSVG -

CAG AGGCAGCTCGTGTGCCTGGCCCG AGCCCTGCTCCGCAAG AGCCGCATCCTGGTTTTA 

4261 4 + 4- + ~- -4- + 4320 

GTCTCCGTCGAGCACACGG ACCGGGCTCGGGACGAGGCGTTCTCGGCGTAGG ACCAAAAT 

a GRGLVCLARALLRKSRILVl - 

GACGAGGCCACACCTGCCATCGACCTGGAGACTGACAACCTCATCGAGGCTACCATCCGC 

4321 + + + + + + 4380 

CTGCTCCGGTGTCGACGGTAGCTGGACCTCTGACTGTTGGAGTAGGTCCGATGGTAGGCG 

a DEATAAIDLETDNLlGATtR - 

ACCCAGTTTGATACCTGCACTGTCCTGACCATCGCACACCGGCTTAACACTATCATGGAC 

4381 + + + + + + 4440 

TGGGTCAAACTATGGACGTG ACAGGACTGGTAGCGTGTGGCCGAATTGTGATAGTACCTG 

a TGFDTCTVLTIAHRLNTIMD - 

TACACCAGGGTCCTGGTCCTGGACAAAGGAGTAGTAGCTGAATTTGATTCTCCAGCCAAC 

4441 + 4- 4- 4- 4- 4- 4500 

ATGTG GTCCC AG G ACC AG G ACCTG TTTCCTC ATC ATCG ACTT A A ACTAAG AG GTCG G TTG 

a YTRVLVLDKG VVAEFDSPAN - 

CTC ATTG CAG CT A G AG G C ATCTTCT ACG G G ATG G CC AG AG ATG CTG G ACTTG CCT AA 

4501 + 4- + + 4- 4557 

G AGTAACGTCG ATCTCCGT AG AAG ATGCCCT ACCGGTCTCT ACG ACCTG AACG G ATT 

a LIAARGIFYGMARDAGLA *



Figure 14K 

MOAT E cDNA AND AMINO ACID SEQUENCE ENCODED THEREBY
ATGGCCGCGCCTGCTGAGCCCTGCGCGGGGCAGGGGGTCTGGAACCAGACAGAGCCTGAA

1 + + + + + + 60

TACCGGCGCGGACGACTCGGGACGCGCCCCGTCCCCCAGACCTTGGTCTGTCTCGGACTT 

a MAAPAEPCAGQGVWNQTEPE - 

CCTGCCGCCACCAGCCTGCTGAGCCTGTGCTTCCTGAGAACAGCAGGGGTCTGGGTACCC

61 + + + + + + 120

GGACGGCGGTGGTCGGACGACTCGGACACGAAGGACTCTTGTCGTCCCCAGACCCATGGG 

a PAATSLLSLCFLRTAGVWVP - 

CCCATGTACCTCTGGGTCCTTGGTCCCATCTACCTCCTCTTCATCCACCACCATGGCCGG 

121 + + + + 4- + 180 

GGGTACATGGAGACCCAGGAACCAGGGTAGATGGAGGAGAAGTAGGTGGTGGTACCGGCC 

a PMYLWVLGPIYLLFIHHKGR - 

GGCTACCTCCGGATGTCCCCACTCTTCAAAGCCAAGATGGTGCTTGGATTCGCCCTCATA 

181 -f + + + + + 240 

CCGATGGAGGCCTACAGGGGTGAGAAGTTTCGGTTCTACCACGAACCTAAGCGGGAGTAT 

a GYLRMSPLFKAKMVLGFALI - 

GTCCTGTGTACCT CC AG CGTG G CTGTCG CTCTTTG G AAAATCC AAC AG G G AACG CCTG AG 

241 + + + + + + 300 

C AG G ACACATG G AG GTCG C ACCG AC AG CG AG AAACC I 1 I I AGGTTGTCCCTTGCGG ACTC 

a VLCTSSVAVALWKiOQGTPE- 

GCCCCAGAATTCCTCATTCATCCTACTGTGTGGCTCACCACGATGAGCTTCGCAGTGTTC 

301 + + 4- + + + 360 

CGGGGTCTTAAGGAGTAAGTAGGATGACACACCGAGTGGTGCTACTCGAAGCGTCACAAG 

a APEFLtHPTVWLTTMSFAVF - 

CTGATTCACACCGAGAGGAAAAAGGGAGTCCAGTCATCTGGAGTGCTGTTTGGTTACTGG 

361 — + + + 4- + 4- 420 

GACTAAGTGTGGCTCTCCTTTTTCCCTCAGGTCAGTAGACCTCACGACAAACCAATGACC

a LIHTERKKGVQSSGVLFGYW -

CTTCTCTGCTTTGTCTTGCCAGCTACCAACGCTGCCCAGCAGGCCTCCGGAGCGGGCTTC 

421 — + + + + + ■* 4 80 

GAAGAGACGAAACAG AACGGTCGATGGTTGCGACGGGTCGTCCGG AGGCCTCGCCCG AAG 

3 LLCFVLPATNAAQQASGAGF 

CAGAGCGACCCTGTCCGCCACCTGTCCACCTACCTATGCCTGTCTCTGGTGGTGGCACAG 
481 + + 4 + „„ + + 540 

GTCTCGCTGGGACAGGCGGTGGACAGGTGGATGGATACGGACAGAGACCACCACCGTGTC 

a GSDPVRHLSTYLCLSLVVAQ - 

TTTGTGCTGTCCTGCCTGGCGGATCAACCCCCCTTCTTCCCTGAAGACCCCCAGCAGTCT 
541 + + + + + + 600 

AAACACGACAGGACGGACCGCCTAGTTGGGGGGAAGAAGGGACTTCTGGGGGTCGTCAGA 

a FVLSCLADQPPFFPEDPOQS - 

AACCCCTGTCCAG AG ACTG G G G C AG CCTTCCCCTCC AAAG CC ACGTTCTGG TGGGTTTCT 

601 + + + + + + 660 

TTGGGGACAGGTCTCTGACCCCGTCGGAAGGGGAGGTTTCGGTGCAAGACCACCCAAAGA 

a NPCPETGAAFPSKATFWWVS - 

G G CCTGGTCTG G AG G G G ATAC AG G AG G CC ACTG AG ACC AAAAG ACCTCTG G TCG CTTG G G 

661 + + + + + + 720 

CCGG ACCAG ACCTCCCCTATG TCCTCCG GTG ACTCTGGTTTTCTG G AG ACC AG CG AACCC 

a GLVWRGYRRPLRPKDLWSLG - 

AGAGAAAACTCCTCAGAAGAACTTGTTTCCCGGCTTGAAuAAGGAGTGGATGAGGAACCGC 

721 + + + + + + 780 

TCTCTTTTGAGGAGTCTTCTTGAACAAAGGGCCGAACTTTTCCTCACCTACTCCTTGGCG 

a RENSSEELVSRLEKEWMRNR- 

AGTGCAGCCCGGAGGCACAACAAGGCAATAGCATTTAAAAGGAAAGGCGGCAGTGGCATG 

781 + + + + — - + + 840 

TCACGTCGGGCCTCCGTGTTGTTCCGTTATCGTAAATTTTCCTTTCCGCCGTCACCGTAC 

a SAARRHNKAIAFKRKGGSGM - 

AAGGCTCCAGAGACCGAGCCCTTCCTACGGCAAGAAGGGAGCCAGTGGCGCCCACTGCTG 

841 + + + + + + 900 

TTCCGAGGTCTCTGGCTCGGGAAGGATGCCGTTCTTCCCTCGGTCACCGCGGGTGACGAC 

a KAPETEPFLRQEGSQWRPLL - 

AAGGCCATCTGGCAGGTGTTCCATTCTACCTTCCTCCTGGGGACCCTCAGCCTCATCATC 
901 + + + + + + 960 

TTCCGGTAGACCGTCCACAAGGTAAGATGGAAGGAGGACCCCTGGGAGTCGGAGTAGTAG 

a KAIWQVFHSTFLLGTLSLII - 

AGTGATGTCTTCAGGTTCACTGTCCCCAAGCTGCTCAGCCTTTTCCTGGAGTTTATTGGT 

961 + + + + + + 1020 

TCACTACAGAAGTCCAAGTGACAGGGGTTCGACGAGTCGGAAAAGGACCTCAAATAACCA 

a SDVFRFTVPKLLSLFLEFIG - 

GATCCCAAGCCTCCAGCCTGGAAGGGCTACCTCCTCGCCGTGCTGATGTTCCTCTCAGCC 

1021 + + + + + + 1080 

CTAGGGTTCGGAGGTCGGACCTTCCCGATGGAGGAGCGGCACGACTACAAGGAGAGTCGG 

a DPKPPAWKGYLLAVLMFLSA - 

TGCCTGCAAACGCTGTTTGAGCAGCAGAACATGTACAGGCTCAAGGTGCCGCAGATGAGG 

1081 + + + + + + 1140 

ACGGACGTTTGCGACAAACTCGTCGTCTTGTACATGTCCGAGTTCCACGGCGTCTACTCC 

a CLQTLFEQQNMYRLKVPQMR - 

TTGCGGTCGGCCATCACTGGCCTGGTGTACAGAAAGGTCCTGGCTCTGTCCAGCGGCTCC 

1141 + + + + + + 1200 

AACGCCAGCCGGTAGTGACCGGACCACATGTCTTTCCAGGACCGAGACAGGTCGCCGAGG 

a LRSAITGLVYRKVLALSSGS - 

AGAAAGGCCAGTGCGGTGGGTGATGTGGTCAATCTGGTGTCCGTGGACGTGCAGCGGCTG 

1201 + + + + + + 1260 

TCTTTCCGGTCACGCCACCCACTACACCAGTTAGACCACAGGCACCTGCACGTCGCCGAC 

a RKASAVGDVVNLVSVDVQRL - 

ACCGAGAGCGTCCTCTACCTCAACGGGCTGTGGCTGCCTCTCGTCTGGATCGTGGTCTGC 

1261 + + + + + + 1320 

TGGCTCTCGCAGGAGATGGAGTTGCCCGACACCGACGGAGAGCAGACCTAGCACCAGACG 

a TESVLYLNGLWLPLVWIVVC - 

TTCGTCTATCTCTGGCAGCTCCTGGGGCCCTCCGCCCTCACTGCCATCGCTGTCTTCCTG 

1321 + + + + + + 1380 

AAGCAGATAGAGACCGTCGAGGACCCCGGGAGGCGGGAGTGACGGTAGCGACAGAAGGAC 

a FVYLWQLLGPSALTAIAVFL - 

AGCCTCCTCCCTCTGAATTTCTTCATCTCCAAGAAAAGGAACCACCATCAGGAGGAGCAA 

1381 + + + + + + 1440 

TCGGAGGAGGGAGACTTAAAGAAGTAGAGGTTCTTTTCCTTGGTGGTAGTCCTCCTCGTT 

a SLLPLNFFISKKRNHHQEEQ- 

ATGAGGCAGAAGGACTCACGGGCACGGCTCACCAGCTCTATCCTCAGGAACTCGAAGACC 

1441 + + + + + + 1500 

TACTCCGTCTTCCTGAGTGCCCGTGCCGAGTGGTCGAGATAGGAGTCCTTGAGCTTCTGG 

a MRQKDSRARLTSSILRNSKT - 

ATCAAGTTCCATGGCTGGGAGGGAGCCTTTCTGGACAGAGTCCTGGGCATCCGAGGCCAG 

1501 + + + + + + 1560 

TAGTTCAAGGTACCGACCCTCCCTCGGAAAGACCTGTCTCAGGACCCGTAGGCTCCGGTC 

a IKFHGWEGAFLDRVLGIRGQ - 

GAGCTGGGCGCCTTGCGGACCTCCGGCCTCCTCTTCTCTGTGTCGCTGGTGTCCTTCCAA 

1561 + + + + + + 1620 

CTCGACCCGCGGAACGCCTGGAGGCCGGAGGAGAAGAGACACAGCGACCACAGGAAGGTT 

a ELGALRTSGLLFSVSLVSFQ 

GTGTCTACATTTCTGGTCGCACTGGTGGTGTTTGCTGTCCACACTCTGGTGGCCGAGAAT 

1621 + + + + + + 1680 

CACAGATGTAAAGACCAGCGTGACCACCACAAACGACAGGTGTGAGACCACCGGCTCTTA 

a VSTFLVALVVFAVHTLVAEN - 

GCTATGAATGCAGAGAAAGCCTTTGTGACTCTCACAGTTCTCAACATCCTCAACAAGGCC 

1681 + + + + + + 1740 

CGATACTTACGTCTCTTTCGGAAACACTGAGAGTGTCAAGAGTTGTAGGAGTTGTTCCGG 

a AMNAEKAFVTLTVLNILNKA - 

CAGGCTTTCCTGCCCTTCTCCATCCACTCCCTCGTCCAGGCCCGGGTGTCCTTTGACCGT 

1741 + + + + + + 1800 

GTCCGAAAGGACGGGAAGAGGTAGGTGAGGGAGCAGGTCCGGGCCCACAGGAAACTGGCA 

a QAFLPFSIHSLVQARVSFDR - 

CTGGTCACCTTCCTCTGCCTGGAAGAAGTTGACCCTGGTGTCGTAGACTCAAGTTCCTCT 

1801 + + + + + + 1860 

GACCAGTGGAAGGAGACGGACCTTCTTCAACTGGGACCACAGCATCTGAGTTCAAGGAGA 

a LVTFLCLEEVDPGVVDSSSS - 

GGAAGCGCTGCCGGGAAGGATTGCATCACCATACACAGTGCCACCTTCGCCTGGTCCCAG 

1861 + + + + + + 1920 

CCTTCGCGACGGCCCTTCCTAACGTAGTGGTATGTGTCACGGTGGAAGCGGACCAGGGTC 

a GSAAGKDCITIHSATFAWSQ - 

GAAAGCCCTCCCTGCCTCCACAGAATAAACCTCACGGTGCCCCAGGGCTGTCTGCTGGCT 

1921 + + + + + + 1980 

CTTTCGGGAGGGACGGAGGTGTCTTATTTGGAGTGCCACGGGGTCCCGACAGACGACCGA 

a ESPPCLHRINLTVPQGCLLA- 

GTTGTCGGTCCAGTGGGGGCAGGGAAGTCCTCCCTGCTGTCCGCCCTCCTTGGGGAGCTG 

1981 + + + + + + 2040 

CAACAGCCAGGTCACCCCCGTCCCTTCAGGAGGGACGACAGGCGGGAGGAACCCCTCGAC 

a VVGPVGAGKSSLLSALLGEL - 

TCAAAGGTGGAGGGGTTCGTGAGCATCGAGGGTGCTGTGGCCTACGTGCCCCAGGAGGCC 

2041 + + + + + + 2100 

AGTTTCCACCTCCCCAAGCACTCGTAGCTCCCACGACACCGGATGCACGGGGTCCTCCGG 

a SKVEGFVSIEGAVAYVPQEA - 

TGGGTGCAGAACACCTCTGTGGTAGAGAATGTGTGCTTCGGGCAGGAGCTGGACCCACCC 

2101 + + + + + + 2160 

ACCCACGTCTTGTGGAGACACCATCTCTTACACACGAAGCCCGTCCTCGACCTGGGTGGG 

a WVQNTSVVENVCFGQELDPP- 

TGGCTGGAGAGAGTACTAGAAGCCTGTGCCCTGCAGCCAGATGTGGACAGCTTCCCTGAG 

2161 + + + + + + 2220 

ACCGACCTCTCTCATGATCTTCGGACACGGGACGTCGGTCTACACCTGTCGAAGGGACTC 

a WLERVLEACALQPDVDSFPE - 

GGAATCCACACTTCAATTGGGGAGCAGGGCATGAATCTCTCCGGAGGCCAGAAGCAGCGG 

2221 + + + + + + 2280 

CCTTAGGTGTGAAGTTAACCCCTCGTCCCGTACTTAGAGAGGCCTCCGGTCTTCGTCGCC 

a GIHTSIGEQGMNLSGGQKQR - 

CTGAGCCTGGCCCGGGCTGTATACAGAAAGGCAGCTGTGTACCTGCTGGATGACCCCCTG 

2281 + + + + + + 2340 

GACTCGGACCGGGCCCGACATATGTCTTTCCGTCGACACATGGACGACCTACTGGGGGAC 

a LSLARAVYRKAAVYLLDDPL - 

GCGGCCCTGGATGCCCACGTTGGCCAGCATGTCTTCAACCAGGTCATTGGGCCTGGTGGG 

2341 + + + + + + 2400 

CGCCGGGACCTACGGGTGCAACCGGTCGTACAGAAGTTGGTCCAGTAACCCGGACCACCC 

a AALDAHVGQHVFNQVIGPGG - 

CTACTCCAGGGAACAACACGGATTCTCGTGACGCACGCACTCCACATCCTGCCCCAGGCT 

2401 + + + + + + 2460 

GATGAGGTCCCTTGTTGTGCCTAAGAGCACTGCGTGCGTGAGGTGTAGGACGGGGTCCGA 

a LLQGTTRILVTHALHILPQA - 

GATTGGATCATAGTGCTGGCAAATGGGGCCATCGCAGAGATGGGTTCCTACCAGGAGCTT 

2461 + + + + + + 2520 

CTAACCTAGTATCACGACCGTTTACCCCGGTAGCGTCTCTACCCAAGGATGGTCCTCGAA 

a DWIIVLANGAIAEMGSYQEL - 

CTGCAGAGGAAGGGGGCCCTCGTGTGTCTTCTGGATCAAGCCAGACAGCCAGGAGATAGA 
2521 + + + + + + 2580 

GACGTCTCCTTCCCCCGGGAGCACACAGAAGACCTAGTTCGGTCTGTCGGTCCTCTATCT 

a LQRKGALVCLLDQARQPGDR - 

GGAGAAGGAGAAACAGAACCTGGGACCAGCACCAAGGACCCCAGAGGCACCTCTGCAGGC 

2581 + + + + + + 2640 

CCTCTTCCTCTTTGTCTTGGACCCTGGTCGTGGTTCCTGGGGTCTCCGTGGAGACGTCCG 

a GEGETEPGTSTKDPRGTSAG - 

AGGAGGCCCGAGCTTAGACGCGAGAGGTCCATCAAGTCAGTCCCTGAGAAGGACCGTACC 

2641 + + + + + + 2700 

TCCTCCGGGCTCGAATCTGCGCTCTCCAGGTAGTTCAGTCAGGGACTCTTCCTGGCATGG 

a RRPELRRERSIKSVPEKDRT - 

ACTTCAGAAGCCCAGACAGAGGTTCCTCTGGATGACCCTGACAGGGCAGGATGGCCAGCA 

2701 + + + + + + 2760 

TGAAGTCTTCGGGTCTGTCTCCAAGGAGACCTACTGGGACTGTCCCGTCCTACCGGTCGT 

a TSEAQTEVPLDDPDRAGWPA - 

GGAAAGGACAGCATCCAATACGGCAGGGTGAAGGCCACAGTGCACCTGGCCTACCTGCGT 

2761 + + + + + + 2820 

CCTTTCCTGTCGTAGGTTATGCCGTCCCACTTCCGGTGTCACGTGGACCGGATGGACGCA 

a GKDSIQYGRVKATVHLAYLR - 

GCCGTGGGCACCCCCCTCTGCCTCTACGCACTCTTCCTCTTCCTCTGCCAGCAAGTGGCC 

2821 + + + + + + 2880 

CGGCACCCGTGGGGGGAGACGGAGATGCGTGAGAAGGAGAAGGAGACGGTCGTTCACCGG 

a AVGTPLCLYALFLFLCQQVA - 

TCCTTCTGCCGGGGCTACTGGCTGAGCCTGTGGGCGGACGACCCTGCAGTAGGTGGGCAG 

2881 + + + + + + 2940 

AGGAAGACGGCCCCGATGACCGACTCGGACACCCGCCTGCTGGGACGTCATCCACCCGTC 

a SFCRGYWLSLWADDPAVGGQ - 

CAGACGCAGGCAGCCCTGCGTGGCGGGATCTTCGGGCTCCTCGGCTGTCTCCAAGCCATT 
2941 + + + + + + 3000 

GTCTGCGTCCGTCGGGACGCACCGCCCTAGAAGCCCGAGGAGCCGACAGAGGTTCGGTAA 

a QTQAALRGGIFGLLGCLQAI - 

GGGCTGTTTGCCTCCATGGCTGCGGTGCTCCTAGGTGGGGCCCGGGCATCCAGGTTGCTC 

3001 + + + + + + 3060 

CCCGACAAACGGAGGTACCGACGCCACGAGGATCCACCCCGGGCCCGTAGGTCCAACGAG 

a GLFASMAAVLLGGARASRLL 

TTCCAGAGGCTCCTGTGGGATGTGGTGCGATCTCCCATCAGCTTCTTTGAGCGGACACCC 

3061 + + + + + + 3120 

AAGGTCTCCGAGGACACCCTACACCACGCTAGAGGGTAGTCGAAGAAACTCGCCTGTGGG 

a FQRLLWDVVRSPISFFERTP - 

ATTGGTCACCTGCTAAACCGCTTCTCCAAGGAGACAGACACGGTTGACGTGGACATTCCA 

3121 + + + + + + 3180 

TAACCAGTGGACGATTTGGCGAAGAGGTTCCTCTGTCTGTGCCAACTGCACCTGTAAGGT 

a IGHLLNRFSKETDTVDVDIP - 

GACAAACTCCGGTCCCTGCTGATGTACGCCTTTGGACTCCTGGAGGTCAGCCTGGTGGTG 

3181 + + + + + + 3240 

CTGTTTGAGGCCAGGGACGACTACATGCGGAAACCTGAGGACCTCCAGTCGGACCACCAC 

a DKLRSLLMYAFGLLEVSLVV- 

GCAGTGGCTACCCCACTGGCCACTGTGGCCATCCTGCCACTGTTTCTCCTCTACGCTGGG 

3241 + + + + + + 3300 

CGTCACCGATGGGGTGACCGGTGACACCGGTAGGACGGTGACAAAGAGGAGATGCGACCC 

a AVATPLATVAILPLFLLYAG - 

TTTCAGAGCCTGTATGTGGTTAGCTCATGCCAGCTGAGACGCTTGGAGTCAGCCAGCTAC 

3301 + + + + + + 3360 

AAAGTCTCGGACATACACCAATCGAGTACGGTCGACTCTGCGAACCTCAGTCGGTCGATG 

a FQSLYVVSSCQLRRLESASY - 

TCGTCTGTCTGCTCCCACATGGCTGAGACGTTCCAGGGCAGCACAGTGGTCCGGGCATTC 

3361 + + + + + + 3420 

AGCAGACAGACGAGGGTGTACCGACTCTGCAAGGTCCCGTCGTGTCACCAGGCCCGTAAG 

a SSVCSHMAETFQGSTVVRAF - 

CGAACCCAGGCCCCTCTTGTGGCTCAGAACAATGCTCGCGTAGATGAAAGCCAGAGGATC 

3421 + + + + + + 3480 

GCTTGGGTCCGGGGAGAACACCGAGTCTTGTTACGAGCGCATCTACTTTCGGTCTCCTAG 

a RTQAPLVAQNNARVDESQRI - 

AGTTTCCCGCGACTGGTGGCTGACAGGTGGCTTGCGGCCAATGTGGAGCTCCTGGGGAAT 

3481 + + + + + + 3540 

TCAAAGGGCGCTGACCACCGACTGTCCACCGAACGCCGGTTACACCTCGAGGACCCCTTA 

a SFPRLVADRWLAANVELLGN - 

GGCCTGGTGTTTGCAGCTGCCACGTGTGCTGTGCTGAGCAAAGCCCACCTCAGTGCTGGC 
3541 + + + + + + 3600 

CCGGACCACAAACGTCGACGGTGCACACGACACGACTCGTTTCGGGTGGAGTCACGACCG 

a GLVFAAATCAVLSKAHLSAG - 

CTCGTGGGCTTCTCTGTCTCTGCTGCCCTCCAGGTGACCCAGGCACTGCAGTGGGTTGTT 

3601 + + + + + + 3660 

GAGCACCCGAAGAGACAGAGACGACGGGAGGTCCACTGGGTCCGTGACGTCACCCAACAA 

a LVGFSVSAALQVTQALQWVV - 

CGCAACTGGACAGACCTAGAGAACAGCATCGTGTCAGTGGAGCGGATGCAGGACTATGCC 

3661 + + + + + + 3720 

GCGTTGACCTGTCTGGATCTCTTGTCGTAGCACAGTCACCTCGCCTACGTCCTGATACGG 

a RNWTDLENSIVSVERMQDYA - 

TGGACGCCCAAGGAGGCTCCCTGGAGGCTGCCCACATGTGCAGCTCAGCCCCCCTGGCCT 

3721 + + + + + + 3780 

ACCTGCGGGTTCCTCCGAGGGACCTCCGACGGGTGTACACGTCGAGTCGGGGGGACCGGA 

a WTPKEAPWRLPTCAAQPPWP - 

CAGGGCGGGCAGATCGAGTTCCGGGACTTTGGGCTAAGATACCGACCTGAGCTCCCGCTG 

3781 + + + + + + 3840 

GTCCCGCCCGTCTAGCTCAAGGCCCTGAAACCCGATTCTATGGCTGGACTCGAGGGCGAC 

a QGGQIEFRDFGLRYRPELPL - 

GCTGTGCAGGGCGTGTCCCTCAAGATCCACGCAGGAGAGAAGGTGGGCATCGTTGGCAGG 

3841 + + + + + + 3900 

CGACACGTCCCGCACAGGGAGTTCTAGGTGCGTCCTCTCTTCCACCCGTAGCAACCGTCC 

a AVQGVSLKIHAGEKVGIVGR - 

ACCGGGGCAGGGAAGTCCTCCCTGGCCAGTGGGCTGCTGCGGCTCCAGGAGGCAGCTGAG 

3901 + + + + + + 3960 

TGGCCCCGTCCCTTCAGGAGGGACCGGTCACCCGACGACGCCGAGGTCCTCCGTCGACTC 

a TGAGKSSLASGLLRLQEAAE - 

GGTGGGATCTGGATCGACGGGGTCCCCATTGCCCACGTGGGGCTGCACACACTGCGCTCC 

3961 + + + + + + 4020 

CCACCCTAGACCTAGCTGCCCCAGGGGTAACGGGTGCACCCCGACGTGTGTGACGCGAGG 

a GGIWIDGVPIAHVGLHTLRS - 

AGGATCAGCATCATCCCCCAGGACCCCATCCTGTTCCCTGGCTCTCTGCGGATGAACCTC 

4021 + + + + + + 4080 

TCCTAGTCGTAGTAGGGGGTCCTGGGGTAGGACAAGGGACCGAGAGACGCCTACTTGGAG 

a RISIIPQDPILFPGSLRMNL - 

GACCTGCTGCAGGAGCACTCGGACGAGGCTATCTGGGCAGCCCTGGAGACGGTGCAGCTC 

4081 + + + + + + 4140 

CTGGACGACGTCCTCGTGAGCCTGCTCCGATAGACCCGTCGGGACCTCTGCCACGTCGAG 

a DLLQEHSDEAIWAALETVQL - 

AAAGCCTTGGTGGCCAGCCTGCCCGGCCAGCTGCAGTACAAGTGTGCTGACCGAGGCGAG 

4141 + + + + + + 4200 

TTTCGGAACCACCGGTCGGACGGGCCGGTCGACGTCATGTTCACACGACTGGCTCCGCTC 

a KALVASLPGQLQYKCADRGE - 

GACCTGAGCGTGGGCCAGAAACAGCTCCTGTGTCTGGCACGTGCCCTTCTCCGGAAGACC 

4201 + + + + + + 4260 

CTGGACTCGCACCCGGTCTTTGTCGAGGACACAGACCGTGCACGGGAAGAGGCCTTCTGG 

a DLSVGQKQLLCLARALLRKT - 

CAGATCCTCATCCTGGACGAGGCTACTGCTGCCGTGGACCCTGGCACGGAGCTGCAGATG 

4261 + + + + + + 4320 

GTCTAGGAGTAGGACCTGCTCCGATGACGACGGCACCTGGGACCGTGCCTCGACGTCTAC 

a QILILDEATAAVDPGTELQM - 

CAGGCCATGCTCGGGAGCTGGTTTGCACAGTGCACTGTGCTGCTCATTGCCCACCGCCTG 

4321 + + + — + + + 4380 

GTCCGGTACGAGCCCTCGACCAAACGTGTCACGTGACACGACGAGTAACGGGTGGCGGAC 

a QAMLGSWFAQCTVLLIAHRL 

CGCTCCGTGATGGACTGTGCCCGGGTTCTGGTCATGGACAAGGGGCAGGTGGCAGAGAGC 

4381 + + -+- + + + 4440 

GCGAGGCACTACCTGACACGGGCCCAAGACCAGTACCTGTTCCCCGTCCACCGTCTCTCG 

a RSVMDCARVLVMDKGQVAES- 

GGCAGCCCGGCCCAGCTGCTGGCCCAGAAGGGCCTGTTTTACAGACTGGCCCAGGAGTCA 

4441 + + + + + + 4500 

CCGTCGGGCCGGGTCGACGACCGGGTCTTCCCGGACAAAATGTCTGACCGGGTCCTCAGT 

a GSPAQLLAQKGLFYRLAQES - 

GGCCTGGTCTGA 

4501 + - 4512 

CCGGACCAGACT 

a G L V * - 

<210> 1 
<211> 4231 
<212> DNA 
<213> Homo sapiens 

<400> 1 
ggacaggcgt ggcggccgga gccccagcat ccctgcttga ggtccaggag cggagcccgc 60 
ggccaccgcc gcctgatcag cgcgaccccg gcccgcgccc gccccgcccg gcaagatgct 120 
gcccgtgtac caggaggtga agcccaaccc gctgcaggac gcgaacatct gctcacgcgt 180 
gttcttctgg tggctcaatc ccttgtttaa aattggccat aaacggagat tagaggaaga 240 
tgatatgtat tcagtgctgc cagaagaccg ctcacagcac cttggagagg agttgcaagg 300 
gttctgggat aaagaagttt taagagctga gaatgacgca cagaagcctt ctttaacaag 360 
agcaatcata aagtgttact ggaaatctta tttagttttg ggaattttta cgttaattga 420 
ggaaagtgcc aaagtaatcc agcccatatt tttgggaaaa attattaatt attttgaaaa 480 
ttatgatccc atggattctg tggctttgaa cacagcgtac gcctatgcca cggtgctgac 540 
tttttgcacg ctcattttgg ctatactgca tcacttatat ttttatcacg ttcagtgtgc 600 
tgggatgagg ttacgagtag ccatgtgcca tatgatttat cggaaggcac ttcgtcttag 660 
taacatggcc atggggaaga caaccacagg ccagatagtc aatctgctgt ccaatgatgt 720 
gaacaagttt gatcaggtga cagtgttctt acacttcctg tgggcaggac cactgcaggc 780 
gatcgcagtg actgccctac tctggatgga gataggaata tcgtgccttg ctgggatggc 840 
agttctaatc attctcctgc ccttgcaaag ctgttttggg aagttgttct catcactgag 900 
gagtaaaact gcaactttca cggatgccag gatcaggacc atgaatgaag ttataactgg 960 
tataaggata ataaaaatgt acgcctggga aaagtcattt tcaaatctta ttaccaattt 1020 
gagaaagaag gagatttcca agattctgag aagttcctgc ctcaggggga tgaatttggc 1080 
ttcgtttttc agtgcaagca aaatcatcgt gtttgtgacc ttcaccacct acgtgctcct 1140 
cggcagtgtg atcacagcca gccgcgtgtt cgtggcagtg acgctgtatg gggctgtgcg 1200 
gctgacggtt accctcttct tcccctcagc cattgagagg gtgtcagagg caatcgtcag 1260 
catccgaaga atccagacct ttttgctact tgatgagata tcacagcgca accgtcagct 1320 
gccgtcagat ggtaaaaaga tggtgcatgt gcaggatttt actgcttttt gggataaggc 1380 
atcagagacc ccaactctac aaggcctttc ctttactgtc agacctggcg aattgttagc 1440 
tgtggtcggc cccgtgggag cagggaagtc atcactgtta agtgccgtgc tcggggaatt 1500 
ggccccaagt cacgggctgg tcagcgtgca tggaagaatt gcctatgtgt ctcagcagcc 1560 
ctgggtgttc tcgggaactc tgaggagtaa tattttattt gggaagaaat atgaaaagga 1620 
acgatatgaa aaagtcataa aggcttgtgc tctgaaaaag gatttacagc tgttggagga 1680 
tggtgatctg actgtgatag gagatcgggg aaccacgctg agtggagggc agaaagcacg 1740 






ggtaaacctt gcaagagcag tgtatcaaga tgctgacatc tatctcctgg acgatcctct 1800 
cagtgcagta gatgcggaag ttagcagaca cttgttcgaa ctgtgtattt gtcaaatttt 1860 
gcatgagaag atcacaattt tagtgactca tcagttgcag tacctcaaag ctgcaagtca 1920 
gattctgata ttgaaagatg gtaaaatggt gcagaagggg acttacactg agttcctaaa 1980 
atctggtata gattttggct cccttttaaa gaaggataat gaggaaagtg aacaacctcc 2040 
agttccagga actcccacac taaggaatcg taccttctca gagtcttcgg tttggtctca 2100 
acaatcttct agaccctcct tgaaagatgg tgctctggag agccaagata cagagaatgt 2160 
cccagttaca ctatcagagg agaaccgttc tgaaggaaaa gttggttttc aggcctataa 2220 
gaattacttc agagctggtg ctcactggat tgtcttcatt ttccttattc tcctaaacac 2280 
tgcagctcag gttgcctatg tgcttcaaga ttggtggctt tcatactggg caaacaaaca 2340 
aagtatgcta aatgtcactg taaatggagg aggaaatgta accgagaagc tagatcttaa 2400 
ctggtactta ggaatttatt caggtttaac tgtagctacc gttctttttg gcatagcaag 2460 
atctctattg gtattctacg tccttgttaa ctcttcacaa actttgcaca acaaaatgtt 2520 
tgagtcaatt ctgaaagctc cggtattatt ctttgataga aatccaatag gaagaatttt 2580 
aaatcgtttc tccaaagaca ttggacactt ggatgatttg ctgccgctga cgtttttaga 2640 
tttcatccag acattgctac aagtggttgg tgtggtctct gtggctgtgg ccgtgattcc 2700 
ttggatcgca atacccttgg ttccccttgg aatcattttc atttttcttc ggcgatattt 2760 
tttggaaacg tcaagagatg tgaagcgcct ggaatctaca actcggagtc cagtgttttc 2820 
ccacttgtca tcttctctcc aggggctctg gaccatccgg gcatacaaag cagaagagag 2880 
gtgtcaggaa ctgtttgatg cacaccagga tttacattca gaggcttggt tcttgttttt 2940 
gacaacgtcc cgctggttcg ccgtccgtct ggatgccatc tgtgccatgt ttgtcatcat 3000 
cgttgccttt gggtccctga ttctggcaaa aactctggat gccgggcagg ttggtttggc 3060 
actgtcctat gccctcacgc tcatggggat gtttcagtgg tgtgttcgac aaagtgctga 3120 
agttgagaat atgatgatct cagtagaaag ggtcattgaa tacacagacc ttgaaaaaga 3180 
agcaccttgg gaatatcaga aacgcccacc accagcctgg ccccatgaag gagtgataat 3240 
ctttgacaat gtgaacttca tgtacagtcc aggtgggcct ctggtactga agcatctgac 3300 
agcactcatt aaatcacaag aaaaggttgg cattgtggga agaaccggag ctggaaaaag 3360 
ttccctcatc tcagcccttt ttagattgtc agaacccgaa ggtaaaattt ggattgataa 3420 
gatcttgaca actgaaattg gacttcacga tttaaggaag aaaatgtcaa tcatacctca 3480 
ggaacctgtt ttgttcactg gaacaatgag gaaaaacctg gatcccttta aggagcacac 3540 
ggatgaggaa ctgtggaatg ccttacaaga ggtacaactt aaagaaacca ttgaagatct 3600 
tcctggtaaa atggatactg aattagcaga atcaggatcc aattttagtg ttggacaaag 3660 
acaactggtg tgccttgcca gggcaattct caggaaaaat cagatattga ttattgatga 3720 
agcgacggca aatgtggatc caagaactga tgagttaata caaaaaaaaa tccgggagaa 3780 
atttgcccac tgcaccgtgc taaccattgc acacagattg aacaccatta ttgacagcga 3840 
caagataatg gttttagatt caggaagact gaaagaatat gatgagccgt atgttttgct 3900 
gcaaaataaa gagagcctat tttacaagat ggtgcaacaa ctgggcaagg cagaagccgc 3960 
tgccctcact gaaacagcaa aacaggtata cttcaaaaga aattatccac atattggtca 4020 
cactgaccac atggttacaa acacttccaa tggacagccc tcgaccttaa ctattttcga 4080 
gacagcactg tgaatccaac caaaatgtca agtccgttcc gaaggcattt tccactagtt 4140 
tttggactat gtaaaccaca ttgtactttt ttttactttg gcaacaaata tttatacata 4200 
caagatgcta gttcatttga atatttctcc c 4231 



<210> 2 

<211> 1325 

<212> PRT 

<213> Homo sapiens 



<400> 2 



Met 


Leu 


Pro 


Val 


Tyr 


Gin 


Glu 


Val 


Lys 


Pro 


Asn 


Pro 


Leu 


Gin 


Asp 


Ala 


1 








5 










10 










15 




Asn 


He 


Cys 


Ser 
20 


Arg 


Val 


Phe 


Phe 


Trp 
25 


Trp 


Leu 


Asn 


Pro 


Leu 
30 


Phe 


Lys 


He 


Gly 


His 
35 


Lys 


Arg 


Arg 


Leu 


Glu 
40 


Glu 


Asp 


Asp 


Met 


Tyr 
45 


Ser 


Val 


Leu 


Pro 


Glu 
50 


Asp 


Arg 


Ser 


Gin 


His 

55 


Leu 


Gly 


Glu 


Glu 


Leu 
60 


Gin 


Gly 


Phe 


Trp 


Asp 


Lys 


Glu 


Val 


Leu 


Arg 


Ala 


Glu 


Asn 


Asp 


Ala 


Gin 


Lys 


Pro 


Ser 


Leu 


65 










70 










75 










80 


Thr 


Arg 


Ala 


He 


He 
85 


Lys 


Cys 


Tyr 


Trp 


Lys 
90 


Ser 


Tyr 


Leu 


Val 


Leu 
95 


Gly 


He 


Phe 


Thr 


Leu 
100 


He 


Glu 


Glu 


Ser 


Ala 
105 


Lys 


Val 


He 


Gin 


Pro 
110 


He 


Phe 
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Gly 




He 






115 




Val 


Ala 




Asn 




130 






Thr 


Leu 


He 


Leu 


14 5 








Cys 


Ala 


Gly 


Met 


Lys 


Ala 


Leu 


Arg 








180 


Gin 


lie 


Val 


Asn 






195 




Th.ir 


Val 


Phe 


Leu 




210 






Val 


Thr 


Ala 


Leu 


225 








Met 


Ala 


Val 


Leu 


Leu 


Phe 


Ser 


Ser 








2 60 


lie 


Arg 


Thr 


Met 






275 




ly x. 


Ala 


Tm 


Glu 




290 






Lys 


Glu 


He 


Ser 


305 








Leu. 


Ala 


Ser 


Phe 


Thr 


Thr 


Tyr 


Val 








340 


Val 


Ala 


Val 


Thr 






355 




Phe 


Pro 


Ser 


Ala 




370 






Arg 


lie 


Gin 


Thr 


385 








Gin 


Leu 


Pro 


Ser 


Ala 


Phe 


Trp 


Asp 








420 


Phe 


Thr 


Val 


Arg 






435 




Ala 


Gly 


Lys 


Ser 




450 






Ser 


His 


Gly 


Leu 


465 








Gin 


Pro 


Tno 


Val 


Lys 


Lys 


Tyr 


Glu 








500 


Leu 


T 


Lys 


Asp 






515 




Gly 


Asp 


Arg 


Gly 




530 






Leu 


A 1 = 




fi 1 a 

xA -L ex. 


545 








Pro 


Leu 


Ser 


Ala 


Cys 


He 


Cys 


Gin 








580 


Gin 


Leu 


Gin 


Tyr 






595 




Gly 


Lys 


Met 


Val 




610 







He Asn Tyr Phe 
120 

Thr Ala Tyr Ala 
135 

Ala He Leu His 
150 

Arg Leu Arg Val 
165 

Leu Ser Asn Met 

Leu Leu Ser Asn 
200 

His Phe Leu Trp 
215 

Leu Trp Met Glu 
230 

He He Leu Leu 
245 

Leu Arg Ser Lys 

Asn Glu Val He 
280 

Lys Ser Phe Ser 
295 

Lys He Leu Arg 
310 

Phe Ser Ala Ser 
325 

Leu Leu Gly Ser 

Leu Tyr Gly Ala 
360 

He Glu Arg Val 
375 

Phe Leu Leu Leu 
390 

Asp Gly Lys Lys 
405 

Lys Ala Ser Glu 

Pro Gly Glu Leu 
440 

Ser Leu Leu Ser 
455 

Val Ser Val His 
470 

Phe Ser Gly Thr 
485 

Lys Glu Arg Tyr 

Leu Gin Leu Leu 

520 

Thr Pro Leu Ser 

535 

Val Tyr Gin Asp 
550 

Val Asp Ala Glu 
565 

He Leu His Glu 

Leu Lys Ala Ala 
600 

Gin Lys Gly Thr 
615 



Glu 


Asn 


Tvr 


Asp 


Tyr 


Ala 


Thr 


Val 








140 


Hi s 


Leu 


Tvr 


Phe 






155 




Ala 


Met 


Cys 


His 




170 






Ala 


Met 


Gly 


Lys 


185 








Asp 


Val 


Asn 


Lys 


Ala 


Gly 


Pro 


Leu 








220 


He 


Gly 


He 


Ser 






23 5 




Pro 


Leu 


Gin 


Ser 




250 






Thr 


Ala 


Thr 


Phe 


2 65 








Thr 


Gly 


He 


Arg 


Asn 


Leu 


He 


Thr 








300 


Ser 


Ser 


Cys 


Leu 






315 




ys 


He 


He 


Val 




330 






Val 


He 


Thr 


Ala 


345 








Val 


Arg 


Leu 


Thr 


Ser 


Glu 


Ala 


He 








380 


Asp 


Glu 


He 


Ser 






395 




Met 


Val 


His 


Val 




410 






Thr 


Pro 


Thr 


Leu 


425 








Leu 


Ala 


Val 


Val 


Ala 


Val 


Leu 


Gly 








460 


Glv 


Arg 


He 


Ala 






475 




Leu 


Arg 


Ser 


Asn 




490 






Glu 


Lys 


Val 


He 


505 








Glu 


Asp 


Gly 


Asp 


Gly 


Gly 


Gin 


Lys 








540 


Ala 


Asp 


He 


Tyr 






555 




Val 


Ser 


Arg 


His 




570 






Lys 


He 


Thr 


He 


585 








Ser 


Gin 


He 


Leu 


Tyr 


Thr 


Glu 


Phe 



620 



Pro 


Met 


Asp 


Ser 


125 








Leu 


Thr 


Phe 


Cys 


Tyr 


His 


Val 


Gin 








160 


Met 


He 


TVr 

y 


Arg 






175 




Thr 


Thr 


Thr 


Gly 




190 






Phe 


*^3? 


Gin 


Val 


2 0 5 








Gin 


Ala 


He 


Ala 


Cys 


Leu 


Ala 


Gly 








240 


Cys 


Phe 


Gly 


Lys 






2 5 5 




Thr 


Asp 


Ala 


Arg 




270 






He 


He 


Lys 


Met 


285 








Asn 


L eu 


Arg 


Lys 


Arg 


Gly 


Met 


Asn 








320 


Phe 


Val 


Thr 


Phe 






335 




Ser 


Arg 


Val 


Phe 




350 






Val 


Thr 


Leu 


Phe 


365 








Val 


Ser 


He 


Arg 


Gin 


Arg 


Asn 


Arg 








400 


Gin 


Asp 


Phe 


Thr 






415 




Gin 


Gly 


Leu 


Ser 




430 






Gly 


Pro 


Val 


Gly 


445 








Glu 


Leu 


Ala 


Pro 


Tyr 


Val 


Ser 


Gin 








480 


He 


Leu 


Phe 


Glv 






495 




Lys 


Ala 


Cvs 


Ala 




510 






Leu 


Thr 


Val 


He 


525 








Ala 


Arg 


Val 


Asn 


Leu 


Leu 


Asp 


Asp 








560 


Leu 


Phe 


Glu 


Leu 






575 




Leu 


Val 


Thr 


His 




590 






He 


Leu 


Lys 


Asp 


605 








Leu 


Lys 


Ser 


Gly 
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He 


Asp 


Phe 


Gly 


Ser 


Leu 


Leu 


Lys 


Lys 


Asp 


Asn 


Glu 


Glu 


Ser 


Glu 


Gin 


625 










630 










635 










640 


Pro 


Pro 


Val 


Pro 


Gly 


Thr 


Pro 


Thr 


Leu 


Arg 


Asn 


Arg 


Thr 


Phe 


Ser 


Glu 










645 










650 










655 




Ser 


Ser 


Val 


Tr"D 


Ser 


Gin 


Gin 


Ser 


Ser 


Arg 


Pro 


Ser 


Leu 


Lys 


Asp 


Glv 








660 










665 










670 






Ala 




Glu 


Ser 


Gin 


Asp 


Thr 


Glu 


Asn 


Val 


Pro 


Val 


Thr 


Leu 


Ser 


Glu 






675 










680 










685 








Glu 


Asn 


Arg 


Ser 


Glu 


Gly 


Lys 


Val 


Gly 


Phe 


Gin 


Ala 


Tvr 


Lys 


Asn 


Tvr 




690 










695 










700 










Phe 


Arg 


Ala 


Gly 


Ala 


His 


Tm 


He 


Val 


Phe 


He 


Phe 


Leu 


He 


Leu 


Leu 


705 










710 










715 










720 


Asn 


Thr 


Ala 


Ala 


Gin 


Val 


Ala 


Tvr 


Val 


Leu 


Gin 


Asp 




Trp 


Leu 


Ser 










725 










730 










735 








Ala 


Asn 


Lys 


Gin 


Ser 


Met 


Leu 


Asn 


Val 


Thr 


Val 


Asn 


Gly 


Gly 








740 










745 










750 






Gly 


Asn 


Val 


Thr 


Glu 


Lys 


Leu 


Asp 


Leu 


Asn 




Tvr 


Leu 


Gly 


He 


Tyr 






755 










760 










765 








Seir 


Gly 


Leu 


Thr 


Val 


Ala 


Thr 


Val 


Leu 


Phe 


Glv 

v_j j_ y 


He 


Ala 


Arg 


Ser 


Leu 




770 










115 










780 










Leu 


Val 


Phe 


Tvr 


Val 


Leu 


Val 


Asn 


Ser 


Ser 


Gin 


Thr 


Leu 


His 


Asn 


Lys 


785 










790 










795 










800 


Met 


Phe 


Glu 


Ser 


He 


Leu 


Lys 


Ala 


Pro 


Val 


Leu 


Phe 


Phe 


Asp 


Arg 


Asn 










805 










810 










815 




Pro 


He 


Gly 


Arg 


He 


Leu 


Asn 


Arg 


Phe 


Ser 


Lys 


Asp 


He 


Glv 


His 


Leu 








820 










825 










830 






Asp 


Asp 


Leu 


Leu 


Pro 


Leu 


Thr 


Phe 


Leu 


Asp 


Phe 


He 


Gin 


Thr 


Leu 


Leu 






835 










840 










845 








Gin 


Val 


Val 


Gly 


Val 


Val 


Ser 


Val 


Ala 


Val 


Ala 


Val 


He 


Pro 


Trp 


He 




850 










855 










860 










Ala 


He 


Pro 


Leu 


Val 


Pro 


Leu 


Glv 


He 


He 


Phe 


He 


Phe 


Leu 


Arg 


Arg 


865 










870 










875 










880 




Phe 


Leu 


Glu 


Thr 


Ser 


Arg 


Asp 


Val 


Lys 


Arg 


Leu 


Glu 


Ser 


Thr 


Thr 










885 










890 










895 




Arg 


Ser 


Pro 


Val 


Phe 


Ser 


His 


Leu 


Ser 


Ser 


Ser 


Leu 


Gin 


Glv 


Leu 


Trp 








900 










905 










910 






Thr 


He 


Arg 


Ala 


Tvr 


Lys 


Ala 


Glu 


Glu 


Arg 


Cys 


Gin 


Glu 


Leu 


Phe 


Asp 






915 










920 










925 








Ala 


Hi s 


Gin 


Asp 


Leu 


His 


Ser 


Glu 


Ala 


Tno 


Phe 


Leu 


Phe 


Leu 


Thr 


Thr 




930 










935 










940 










Ser 


Arg 


Trp 


Phe 


Ala 


Val 


Arg 


Leu 


Asp 


Ala 


He 


Cys 


Ala 


Met 


Phe 


Val 


945 










950 










955 










960 


He 


He 


Val 


Ala 


Phe 


Gly 


Ser 


Leu 


He 


Leu 


Ala 


Lys 


Thr 


Leu 


Asp 


Ala 










965 










970 










975 




Gly 


Gin 


Val 


Gly 


Leu 


Ala 


Leu 


Ser 


Tyr 


Ala 


Leu 


Thr 


Leu 


Met 


Gly 


Met 








980 










985 










990 






Phe 


Gin 


Trp 


Cys 


Val 


Arg 


Gin 


Ser 


Ala 


Glu 


Val 


Glu 


Asn 


Met 


Met 


He 






995 










1000 








1005 






Ser 


Val 


Glu 


Arg 


Val 


He 


Glu 


Tyr 


Thr 


Asp 


Leu 


Glu 


Lys 


Glu 


Ala 


Pro 




1010 








1015 








1020 








Trp 


Glu 


Tyr 


Gin 


Lys 


Arg 


Pro 


Pro 


Pro 


Ala 


Trp 


Pro 


His 


Glu 


Gly Val 


1025 








1030 








1035 








104< 


He 


He 


Phe 


Asp 


Asn 


Val 


Asn 


Phe 


Met 


Tyr 


Ser 


Pro 


Gly 


Gly 


Pro 


Leu 










1045 








1050 








1055 


Val 


Leu 


Lys 


His 


Leu 


Thr 


Ala 


Leu 


He 


Lys 


Ser 


Gin 


Glu 


Lys 


Val 


Gly 








1060 








1065 








1070 




He 


Val 


Gly 


Arg 


Thr 


Gly 


Ala 


Gly 


Lys 


Ser 


Ser 


Leu 


He 


Ser 


Ala 


Leu 






1075 








1080 








1085 






Phe 


Arg 


Leu 


Ser 


Glu 


Pro 


Glu 


Gly 


Lys 


He 


Trp 


He 


Asp 


Lys 


He 


Leu 




1090 








1095 








1100 








Thr 


Thr 


Glu 


He 


Gly 


Leu 


His 


Asp 


Leu 


Arg 


Lys 


Lys 


Met 


Ser 


He 


He 


1105 








1110 








1115 








1121 


Pro 


Gin 


Glu 


Pro 


Val 


Leu 


Phe 


Thr 


Gly 


Thr 


Met 


Arg 


Lys 


Asn 


Leu 


Asp 



1125 1130 1135 
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Pro 


Phe 


Lys 


Glu 


His 


Thr 


Asp 


Glu 


Glu 


Leu 


Trp 


Asn 


Ala 


Leu 


Arg 


Glu 








1140 








1145 








1150 




Val 


Gin 


Leu 


Lys 


Glu 


Thr 


He 


Glu 


Asp 


Leu 


Pro 


Gly 


Lys 


Met 


Asp 


Thr 






1155 








1160 








1165 






Glu 


Leu. 


Ala 


Glu 


Ser 


Gly 


Ser 


Asn 


Phe 


Ser 


Val 


Gly 


Gin 


Arg 


Gin 


Leu 




1170 








1175 








1180 








Val 


Cys 


Leu 


Ala 


Arg 


Ala 


He 


Leu 


Arg 


Lys 


Asn 


Gin 


He 


Leu 


He 


He 


1185 








1190 








1195 








1201 


Asp 


Glu 


Ala 


Thr 


Ala 


Asn 


Val 


Asp 


Pro 


Arg 


Thr 


Asp 


Glu 


Leu 


He 


Gin 










1205 








1210 








1215 


Lys 


Lys 


He 


Arg 


Glu 


Lys 


Phe 


Ala 


His 


Cys 


Thr 


Val 


Leu 


Thr 


He 


Ala 








1220 








1225 








1230 




His 


Arg 


Leu 


Asn 


Thr 


He 


He 


Asp 


Ser 


Asp 


Lys 


He 


Met 


Val 


Leu 


Asp 






1235 








1240 








1245 






Ser 


Gly 


Arg 


Leu 


Lys 


Glu 


Tyr 


Asp 


Glu 


Pro 


Tyr 


Val 


Leu 


Leu 


Gin 


Asn 




1250 








1255 








1260 








Lys 


Glu 


Ser 


Leu 


Phe 


Tyr 


Lys 


Met 


Val 


Gin 


Gin 


Leu 


Gly 


Lys 


Ala 


Glu 


1265 








1270 








1275 








1281 


Ala 


Ala 


Ala 


Leu 


Thr 


Glu 


Thr 


Ala 


Lys 


Gin 


Val 


Tyr 


Phe 


Lys 


Arg 


Asn 










1285 








1290 








1295 


Tyr 


Pro 


His 


He 


Gly 


His 


Thr 


Asp 


His 


Met 


Val 


Thr 


Asn 


Thr 


Ser 


Asn 








1300 








1305 








1310 




Gly 


Gin 


Pro 


Ser 


Thr 


Leu 


Thr 


He 


Phe 


Glu 


Thr 


Ala 


Leu 









1315 1320 1325 



<210> 3 

<211> 5838 

<212> DNA 

<213> Homo sapiens 



<400> 3 

ccgggcaggt ggctcatgct cgggagcgtg gttgagcggc tggcgcggtt gtcctggagc 60 

aggggcgcag gaattctgat gtgaaactaa cagtctgtga gccctggaac ctccgctcag 120 

agaagatgaa ggatatcgac ataggaaaag agtatatcat ccccagtcct gggtatagaa 180 

gtgtgaggga gagaaccagc acttctggga cgcacagaga ccgtgaagat tccaagttca 240 

ggagaactcg accgttggaa tgccaagatg ccttggaaac agcagcccga gccgagggcc 300 

tctctcttga tgcctccatg cattctcagc tcagaatcct ggatgaggag catcccaagg 360 

gaaagtacca tcatggcttg agtgctctga agcccatccg gactacttcc aaacaccagc 420 

acccagtgga caatgctggg cttttttcct gtatgacttt ttcgtggctt tcttctctgg 480 

cccgtgtggc ccacaagaag ggggagctct caatggaaga cgtgtggtct ctgtccaagc 540 

acgagtcttc tgacgtgaac tgcagaagac tagagagact gtggcaagaa gagctgaatg 600 

aagttgggcc agacgctgct tccctgcgaa gggttgtgtg gatcttctgc cgcaccaggc 660 

tcatcctgtc catcgtgtgc ctgatgatca cgcagctggc tggcttcagt ggaccagcct 720 

tcatggtgaa acacctcttg gagtataccc aggcaacaga gtctaacctg cagtacagct 780 

tgttgttagt gctgggcctc ctcctgacgg aaatcgtgcg gtcttggtcg cttgcactga 840 

cttgggcatt gaattaccga accggtgtcc gcttgcgggg ggccatccta accatggcat 900 

ttaagaagat ccttaagtta aagaacatta aagagaaatc cctgggtgag ctcatcaaca 960 

tttgctccaa cgatgggcag agaatgtttg aggcagcagc cgttggcagc ctgctggctg 1020 

gaggacccgt tgttgccatc ttaggcatga tttataatgt aattattctg ggaccaacag 1080 

gcttcctggg atcagctgtt tttatcctct tttacccagc aatgatgttt gcatcacggc 1140 

tcacagcata tttcaggaga aaatgcgtgg ccgccacgga tgaacgtgtc cagaagatga 1200 

atgaagttct tacttacatt aaatttatca aaatgtatgc ctgggtcaaa gcattttctc 1260 

agagtgttca aaaaatccgc gaggaggagc gtcggatatt ggaaaaagcc gggtacttcc 1320 

agggtatcac tgtgggtgtg gctcccattg tggtggtgat tgccagcgtg gtgaccttct 1380 

ctgttcatat gaccctgggc ttcgatctga cagcagcaca ggctttcaca gtggtgacag 1440 

tcttcaattc catgactttt gctttgaaag taacaccgtt ttcagtaaag tccctctcag 1500 

aagcctcagt ggctgttgac agatttaaga gtttgtttct aatggaagag gttcacatga 1560 

taaagaacaa accagccagt cctcacatca agatagagat gaaaaatgcc accttggcat 1620

gggactcctc ccactccagt atccagaact cgcccaagct gacccccaaa atgaaaaaag 1680 

acaagagggc ttccaggggc aagaaagaga aggtgaggca gctgcagcgc actgagcatc 1740

aggcggtgct ggcagagcag aaaggccacc tcctcctgga cagtgacgag cggcccagtc 1800

ccgaagagga agaaggcaag cacatccacc tgggccacct gcgcttacag aggacactgc 1860

acagcatcga tctggagatc caagagggta aactggttgg aatctgcggc agtgtgggaa 1920 

gtggaaaaac ctctctcatt tcagccattt taggccagat gacgcttcta gagggcagca 1980 






ttgcaatcag tggaaccttc gcttatgtgg cccagcaggc ctggatcctc aatgctactc 2040

tgagagacaa catcctgttt gggaaggaat afcgatgaaga aagatacaac tctgtgctga 2100 

acagctgctg cctgaggcct gacctggcca ttcttcccag cagcgacctg acggagattg 2160 

gagagcgagg agccaacctg agcggtgggc agcgccagag gatcagcctt gcccgggcct 2220 

tgtatagtga caggagcatc tacatcctgg acgaccccct cagtgcctta gatgcccatg 2280 

tgggcaacca catcttcaat agtgctatcc ggaaacatct caagtccaag acagttctgt 2340 

ttgttaccca ccagttacag tacctggttg actgtgatga agtgatcttc atgaaagagg 2400 

gctgtattac ggaaagaggc acccatgagg aactgatgaa tttaaatggt gactatgcta 2460 

ccatttttaa taacctgttg ctgggagaga caccgccagt tgagatcaat tcaaaaaagg 2520 

aaaccagtgg ttcacagaag aagtcacaag acaagggtcc taaaacagga tcagtaaaga 2580

aggaaaaagc agtaaagcca gaggaagggc agcttgtgca gctggaagag aaagggcagg 2640

gttcagtgcc ctggtcagta tatggtgtct acatccaggc tgctgggggc cccttggcat 2700 

tcctggttat tatggccctt ttcatgctga atgtaggcag caccgccttc agcacctggt 2760

ggttgagtta ctggatcaag caaggaagcg ggaacaccac tgtgactcga gggaacgaga 2820

cctcggtgag tgacagcatg aaggacaatc ctcatatgca gtactatgcc agcatctacg 2880 

ccctctccat ggcagtcatg ctgatcctga aagccattcg aggagttgtc tttgtcaagg 2940 

gcacgctgcg agcttcctcc cggctgcatg acgagctttt ccgaaggatc cttcgaagcc 3000 

ctatgaagtt ttttgacacg acccccacag ggaggattct caacaggttt tccaaagaca 3060 

tggatgaagt tgacgtgcgg ctgccgttcc aggccgagat gttcatccag aacgttatcc 3120 

tggtgttctt ctgtgtggga atgatcgcag gagtcttccc gtggttcctt gtggcagtgg 3180 

ggccccttgt catcctcttt tcagtcctgc acattgtctc cagggtcctg attcgggagc 3240 

tgaagcgtct ggacaatatc acgcagtcac ctttcctctc ccacatcacg tccagcatac 3300 

agggccttgc caccatccac gcctacaata aagggcagga gtttctgcac agataccagg 3360

agctgctgga tgacaaccaa gctccttttt ttttgtttac gtgtgcgatg cggtggctgg 3420 

ctgtgcggct ggacctcatc agcatcgccc tcatcaccac cacggggctg atgatcgttc 3480 

ttatgcacgg gcagattccc ccagcctatg cgggtctcgc catctcttat gctgtccagt 3540 

taacggggct gttccagttt acggtcagac tggcatctga gacagaagct cgattcacct 3600

cggtggagag gatcaatcac tacattaaga ctctgtcctt ggaagcacct gccagaatta 3660

agaacaaggc tccctcccct gactggcccc aggagggaga ggtgaccttt gagaacgcag 3720

agatgaggta ccgagaaaac ctccctcttg tcctaaagaa agtatccttc acgatcaaac 3780 

ctaaagagaa gattggcatt gtggggcgga caggatcagg gaagtcctcg ctggggatgg 3840

ccctcttccg tctggtggag ttatctggag gctgcatcaa gattgatgga gtgagaatca 3900 

gtgatattgg ccttgccgac ctccgaagca aactctctat cattcctcaa gagccggtgc 3960 

tgttcagtgg cactgtcaga tcaaatttgg accccttcaa ccagtacact gaagaccaga 4020 

tttgggatgc cctggagagg acacacatga aagaatgtat tgctcagcta cctctgaaac 4080 

ttgaatctga agtgatggag aatggggata acttctcagt gggggaacgg cagctcttgt 4140 

gcatagctag agccctgctc cgccactgta agattctgat tttagatgaa gccacagctg 4200 

ccatggacac agagacagac ttattgattc aagagaccat ccgagaagca tttgcagact 4260

gtaccatgct gaccattgcc catcgcctgc acacggttct aggctccgat aggattatgg 4320 

tgctggccca gggacaggtg gtggagtttg acaccccatc ggtccttctg tccaacgaca 4380 

gttcccgatt ctatgccatg tttgctgctg cagagaacaa ggtcgctgtc aagggctgac 4440 

tcctccctgt tgacgaagtc tcttttcttt agagcattgc cattccctgc ctggggcggg 4500 

cccctcatcg cgtcctccta ccgaaacctt gcctttctcg attttatctt tcgcacagca 4560 

gttccggatt ggcttgtgtg tttcactttt agggagagtc atattttgat tattgtattt 4620 

attccatatt catgtaaaca aaatttagtt tttgttctta attgcactct aaaaggttca 4680 

gggaaccgtt attataattg tatcagaggc ctataatgaa gctttatacg tgtagctata 4740 

tctatatata attctgtaca tagcctatat ttacagtgaa aatgtaagct gtttatttta 4800 

tattaaaata agcactgtgc taataacagt gcatattcct ttctatcatt tttgtacagt 4860 

ttgctgtact agagatctgg ttttgctatt agactgtagg aagagtagca tttcattctt 4920 

ctctagctgg tggtttcacg gtgccaggtt ttctgggtgt ccaaaggaag acgtgtggca 4980 

atagtgggcc ctccgacagc cccctctgcc gcctccccac agccgctcca ggggtggctg 5040 

gagacgggtg ggcggctgga gaccatgcag agcgccgtga gttctcaggg ctcctgcctt 5100

ctgtcctggt gtcacttact gtttctgtca ggagagcagc ggggcgaagc ccaggcccct 5160 

tttcactccc tccatcaaga atggggatca cagagacatt cctccgagcc ggggagtttc 5220 

tttcctgcct tcttcttttt gctgttgttt ctaaacaaga atcagtctat ccacagagag 5280 

tcccactgcc tcaggttcct atggctggcc actgcacaga gctctccagc tccaagacct 5340 

gttggttcca agccctggag ccaactgctg ctttttgagg tggcactttt tcatttgcct 5400 

attcccacac ctccacagtt cagtggcagg gctcaggatt tcgtgggtct gttttccttt 5460 

ctcaccgcag tcgtcgcaca gtctctctct ctctctcccc tcaaagtctg caactttaag 5520 

cagctcttgc taatcagtgt ctcacactgg cgtagaagtt tttgtactgt aaagagacct 5580 

acctcaggtt gctggttgct gtgtggtttg gtgtgttccc gcaaaccccc tttgtgctgt 5640 

ggggctggta gctcaggtgg gcgtggtcac tgctgtcatc agttgaatgg tcagcgttgc 5700 

atgtcgtgac caactagaca ttctgtcgcc ttagcatgtt tgctgaacac cttgtggaag 5760 

caaaaatctg aaaatgtgaa taaaattatt ttggattttg taaaaaaaaa aaaaaaaaaa 5820

aaaaaaaaaa aaaaaaaa 5838
Met
Met


Ala


Val


Met
Ile
Val


Lys 


Gly 
Lys Asp Met Asp Glu Val Asp Val Arg Leu Pro Phe Gln Ala Glu Met

980 985 990

Phe Ile Gln Asn Val Ile Leu Val Phe Phe Cys Val Gly Met Ile Ala

995 1000 1005

Gly Val Phe Pro Trp Phe Leu Val Ala Val Gly Pro Leu Val Ile Leu

1010 1015 1020

Phe Ser Val Leu His Ile Val Ser Arg Val Leu Ile Arg Glu Leu Lys
1025 1030 1035 1040

Arg Leu Asp Asn Ile Thr Gln Ser Pro Phe Leu Ser His Ile Thr Ser

1045 1050 1055

Ser Ile Gln Gly Leu Ala Thr Ile His Ala Tyr Asn Lys Gly Gln Glu

1060 1065 1070

Phe Leu His Arg Tyr Gln Glu Leu Leu Asp Asp Asn Gln Ala Pro Phe

1075 1080 1085

Phe Leu Phe Thr Cys Ala Met Arg Trp Leu Ala Val Arg Leu Asp Leu

1090 1095 1100

Ile Ser Ile Ala Leu Ile Thr Thr Thr Gly Leu Met Ile Val Leu Met
1105 1110 1115 1120

His Gly Gln Ile Pro Pro Ala Tyr Ala Gly Leu Ala Ile Ser Tyr Ala

1125 1130 1135

Val Gln Leu Thr Gly Leu Phe Gln Phe Thr Val Arg Leu Ala Ser Glu

1140 1145 1150

Thr Glu Ala Arg Phe Thr Ser Val Glu Arg Ile Asn His Tyr Ile Lys

1155 1160 1165

Thr Leu Ser Leu Glu Ala Pro Ala Arg Ile Lys Asn Lys Ala Pro Ser

1170 1175 1180

Pro Asp Trp Pro Gln Glu Gly Glu Val Thr Phe Glu Asn Ala Glu Met
1185 1190 1195 1200

Arg Tyr Arg Glu Asn Leu Pro Leu Val Leu Lys Lys Val Ser Phe Thr

1205 1210 1215

Ile Lys Pro Lys Glu Lys Ile Gly Ile Val Gly Arg Thr Gly Ser Gly

1220 1225 1230

Lys Ser Ser Leu Gly Met Ala Leu Phe Arg Leu Val Glu Leu Ser Gly

1235 1240 1245

Gly Cys Ile Lys Ile Asp Gly Val Arg Ile Ser Asp Ile Gly Leu Ala

1250 1255 1260

Asp Leu Arg Ser Lys Leu Ser Ile Ile Pro Gln Glu Pro Val Leu Phe
1265 1270 1275 1280

Ser Gly Thr Val Arg Ser Asn Leu Asp Pro Phe Asn Gln Tyr Thr Glu

1285 1290 1295

Asp Gln Ile Trp Asp Ala Leu Glu Arg Thr His Met Lys Glu Cys Ile

1300 1305 1310

Ala Gln Leu Pro Leu Lys Leu Glu Ser Glu Val Met Glu Asn Gly Asp

1315 1320 1325

Asn Phe Ser Val Gly Glu Arg Gln Leu Leu Cys Ile Ala Arg Ala Leu

1330 1335 1340

Leu Arg His Cys Lys Ile Leu Ile Leu Asp Glu Ala Thr Ala Ala Met
1345 1350 1355 1360

Asp Thr Glu Thr Asp Leu Leu Ile Gln Glu Thr Ile Arg Glu Ala Phe

1365 1370 1375

Ala Asp Cys Thr Met Leu Thr Ile Ala His Arg Leu His Thr Val Leu

1380 1385 1390

Gly Ser Asp Arg Ile Met Val Leu Ala Gln Gly Gln Val Val Glu Phe

1395 1400 1405

Asp Thr Pro Ser Val Leu Leu Ser Asn Asp Ser Ser Arg Phe Tyr Ala

1410 1415 1420

Met Phe Ala Ala Ala Glu Asn Lys Val Ala Val Lys Gly
1425 1430 1435
<400> 5

ccccatggac gccctgtgcg gttccgggga gctcggctcc aagttctggg actccaacct 60 

gtctgtgcac acagaaaacc cggacctcac tccctgcttc cagaactccc tgctggcctg 120 

ggtgccctgc atctacctgt gggtcgccct gccctgctac ttgctctacc tgcggcacca 180 

ttgtcgtggc tacatcatcc tctcccacct gtccaagctc aagatggtcc tgggtgtcct 240 

gctgtggtgc gtctcctggg cggacctttt ttactccttc catggcctgg tccatggccg 300 

ggcccctgcc cctgttttct ttgtcacccc cttggtggtg ggggtcacca tgctgctggc 360 

caccctgctg atacagtatg agcggctgca gggcgtacag tcttcggggg tcctcattat 420 

cttctggttc ctgtgtgtgg tctgcgccat cgtcccattc cgctccaaga tccttttagc 480 

caaggcagag ggtgagatct cagacccctt ccgcttcacc accttctaca tccactttgc 540 

cctggtactc tctgccctca tcttggcctg cttcagggag aaacctccat ttttctccgc 600 

aaagaatgtc gaccctaacc cctaccctga gaccagcgct ggctttctct cccgcctgtt 660 

tttctggtgg ttcacaaaga tggccatcta tggctaccgg catcccctgg aggagaagga 720

cctctggtcc ctaaaggaag aggacagatc ccagatggtg gtgcagcagc tgctggaggc 780 

atggaggaag caggaaaagc agacggcacg acacaaggct tcagcagcac ctgggaaaaa 840

tgcctccggc gaggacgagg tgctgctggg tgcccggccc aggccccgga agccctcctt 900 

cctgaaggcc ctgctggcca ccttcggctc cagcttcctc atcagtgcct gcttcaagct 960

tatccaggac ctgctctcct tcatcaatcc acagctgctc agcatcctga tcaggtttat 1020 

ctccaacccc atggccccct cctggtgggg cttcctggtg gctgggctga tgttcctgtg 1080 

ctccatgatg cagtcgctga tcttacaaca ctattaccac tacatctttg tgactggggt 1140 

gaagtttcgt actgggatca tgggtgtcat ctacaggaag gctctggtta tcaccaactc 1200

agtcaaacgt gcgtccactg tgggggaaat tgtcaacctc atgtcagtgg atgcccagcg 1260

cttcatggac cttgccccct tcctcaatct gctgtggtca gcacccctgc agatcatcct 1320 

ggcgatctac ttcctctggc agaacctagg tccctctgtc ctggctggag tcgctttcat 1380 

ggtcttgctg attccactca acggagctgt ggccgtgaag atgcgcgcct tccaggtaaa 1440 

gcaaatgaaa ttgaaggact cgcgcatcaa gctgatgagt gagatcctga acggcatcaa 1500 

ggtgctgaag ctgtacgcct gggagcccag cttcctgaag caggtggagg gcatcaggca 1560

cj'Cfg'tgagctc cagctgctgc gcacggcggc ctacctccac accacaacca ccttcacctg 1620 

gatgtgcagc cccttcctgg tgaccctgat caccctctgg gtgtacgtgt acgtggaccc 1680 

aaacaatgtg ctggacgccg agaaggcctt tgtgtctgtg tccttgttta atatcttaag 1740 

acttcccctc aacatgctgc cccagttaat cagcaacctg actcaggcca gtgtgtctct 1800 

gaaacggatc cagcaattcc tgagccaaga ggaacttgac ccccagagtg tggaaagaaa 1860 

gaccatctcc ccaggctatg ccatcaccat acacagtggc accttcacct gggcccagga 1920 

cctgcccccc actctgcaca gcctagacat ccaggtcccg aaaggggcac tggtggccgt 1980 

ggtggggcct gtgggctgtg ggaagtcctc cctggtgtct gccctgctgg gagagatgga 2040 

gaagctagaa ggcaaagtgc acatgaaggg ctccgtggcc tatgtgcccc agcaggcatg 2100 

gatccagaac tgcactcttc aggaaaacgt gcttttcggc aaagccctga accccaagcg 2160 

ctaccagcag actctggagg cctgtgcctt gctagctgac ctggagatgc tgcctggtgg 2220 

ggatcagaca gagattggag agaagggcat taacctgtct gggggccagc ggcagcgggt 2280 

cagtctggct cgagctgttt acagtgatgc cgatattttc ttgctggatg acccactgtc 2340 

cgcggtggac tctcatgtgg ccaagcacat ctttgaccac gtcatcgggc cagaaggcgt 2400 

gctggcaggc aagacgcgag tgctggtgac gcacggcatt agcttcctgc cccagacaga 2460 

cttcatcatt gtgctagctg atggacaggt gtctgagatg ggcccgtacc cagccctgct 2520 

gcagcgcaac ggctcctttg ccaactttct ctgcaactat gcccccgatg aggaccaagg 2580 

gcacctggag gacagctgga ccgcgttgga aggtgcagag gataaggagg cactgctgat 2640

tgaagacaca ctcagcaacc acacggatct gacagacaat gatccagtca cctatgtggt 2700 

ccagaagcag tttatgagac agctgagtgc cctgtcctca gatggggagg gacagggtcg 2760

gcctgtaccc cggaggcacc tgggtccatc agagaaggtg caggtgacag aggcgaaggc 2820 

agatggggca ctgacccagg aggagaaagc agccattggc actgtggagc tcagtgtgtt 2880 

ctgggattat gccaaggccg tggggctctg taccacgctg gccatctgtc tcctgtatgt 2940 

gggtcaaagt gcggctgcca ttggagccaa tgtgtggctc agtgcctgga caaatgatgc 3000 

catggcagac agtagacaga acaacacttc cctgaggctg ggcgtctatg ctgctttagg 3060

aattctgcaa gggttcttgg tgatgctggc agccatggcc atggcagcgg gtggcatcca 3120 

ggctgcccgt gtgttgcacc aggcactgct gcacaacaag atacgctcgc cacagtcctt 3180 

ctttgacacc acaccatcag gccgcatcct gaactgcttc tccaaggaca tctatgtcgt 3240 

tgatgaggtt ctggcccctg tcatcctcat gctgctcaat tccttcttca acgccatctc 3300 

cactcttgtg gtcatcatgg ccagcacgcc gctcttcact gtggtcatcc tgcccctggc 3360 

tgtgctctac accttagtgc agcgcttcta tgcagccaca tcacggcaac tgaagcggct 3420 

ggaatcagtc agccgctcac ctatctactc ccacttttcg gagacagtga ctggtgccag 3480 

tgtcatccgg gcctacaacc gcagccggga ttttgagatc atcagtgata ctaaggtgga 3540 

tgccaaccag agaagctgct acccctacat catctccaac cggtggctga gcatcggagt 3600

ggagttcgtg gggaactgcg tggtgctctt tgctgcacta tttgccgtca tcgggaggag 3660

cagcctgaac ccggggctgg tgggcctttc tgtgtcctac tccttgcagg tgacatttgc 3720 

tctgaactgg atgatacgaa tgatgtcaga tttggaatct aacatcgtgg ctgtggagag 3780 

ggtcaaggag tactccaaga cagagacaga ggcgccctgg gtggtggaag gcagccgccc 3840



tcccgaaggt tggcccccac gtggggaggt ggagttccgg aattattctg tgcgctaccg 3900

gccgggccta gacctggtgc tgagagacct gagtctgcat gtgcacggtg gcgagaaggt 3960

ggggatcgtg ggccgcactg gggctggcaa gtcttccatg accctttgcc tgttccgcat 4020

cctggaggcg gcaaagggtg aaatccgcat tgatggcctc aatgtggcag acatcggcct 4080

ccatgacctg cgctctcagc tgaccatcat cccgcaggac cccatcctgt tctcggggac 4140

cctgcgcatg aacctggacc ccttcggcag ctactcagag gaggacattt ggtgggcttt 4200

ggagctgtcc cacctgcaca cgtttgtgag ctcccagccg gcaggcctgg acttccagtg 4260

ctcagagggc ggggagaatc tcagcgtggg ccagaggcag ctcgtgtgcc tggcccgagc 4320

cctgctccgc aagagccgca tcctggtttt agacgaggcc acagctgcca tcgacctgga 4380

gactgacaac ctcatccagg ctaccatccg cacccagttt gatacctgca ctgtcctgac 4440

catcgcacac cggcttaaca ctatcatgga ctacaccagg gtcctggtcc tggacaaagg 4500

agtagtagct gaatttgatt ctccagccaa cctcattgca gctagaggca tcttctacgg 4560

gatggccaga gatgctggac ttgcctaaaa tatattcctg agatttcctc ctggcctttc 4620

ctggttttca tcaggaagga aatgacacca aatatgtccg cagaatggac ttgatagcaa 4680

acactggggg caccttaaga ttttgcacct gtaaagtgcc ttacagggta actgtgctga 4740

atgctttaga tgaggaaatg atccccaagt ggtgaatgac acgcctaagg tcacagctag 4800

tttgagccag ttagactagt ccccggtctc ccgattccca actgagtgtt atttgcacac 4860

tgcactgttt tcaaataacg attttatgaa atgacctctg tcctccctct gatttttcat 4920

attttctaaa gtttcgtttc tgttttttaa taaaaagctt tttcctcctg gaacagaaga 4980

cagctgctgg gtcaggccac ccctaggaac tcagtcctgt actctggggt gctgcctgaa 5040

tccattaaaa atgggagtac tgatgaaata aaactacag 5079



<210> 6 

<211> 1527 

<212> PRT 

<213> Homo sapiens 



<400> 6 



Met Asp Ala Leu Cys Gly Ser Gly Glu Leu Gly Ser Lys Phe Trp Asp

1 5 10 15

Ser Asn Leu Ser Val His Thr Glu Asn Pro Asp Leu Thr Pro Cys Phe

20 25 30

Gln Asn Ser Leu Leu Ala Trp Val Pro Cys Ile Tyr Leu Trp Val Ala

35 40 45

Leu Pro Cys Tyr Leu Leu Tyr Leu Arg His His Cys Arg Gly Tyr Ile

50 55 60

Ile Leu Ser His Leu Ser Lys Leu Lys Met Val Leu Gly Val Leu Leu

65 70 75 80

Trp Cys Val Ser Trp Ala Asp Leu Phe Tyr Ser Phe His Gly Leu Val

85 90 95

His Gly Arg Ala Pro Ala Pro Val Phe Phe Val Thr Pro Leu Val Val

100 105 110

Gly Val Thr Met Leu Leu Ala Thr Leu Leu Ile Gln Tyr Glu Arg Leu

115 120 125

Gln Gly Val Gln Ser Ser Gly Val Leu Ile Ile Phe Trp Phe Leu Cys

130 135 140

Val Val Cys Ala Ile Val Pro Phe Arg Ser Lys Ile Leu Leu Ala Lys

145 150 155 160

Ala Glu Gly Glu Ile Ser Asp Pro Phe Arg Phe Thr Thr Phe Tyr Ile

165 170 175

His Phe Ala Leu Val Leu Ser Ala Leu Ile Leu Ala Cys Phe Arg Glu

180 185 190

Lys Pro Pro Phe Phe Ser Ala Lys Asn Val Asp Pro Asn Pro Tyr Pro

195 200 205

Glu Thr Ser Val Gly Phe Leu Ser Arg Leu Phe Phe Trp Trp Phe Thr

210 215 220

Lys Met Ala Ile Tyr Gly Tyr Arg His Pro Leu Glu Glu Lys Asp Leu

225 230 235 240

Trp Ser Leu Lys Glu Glu Asp Arg Ser Gln Met Val Val Gln Gln Leu

245 250 255

Leu Glu Ala Trp Arg Lys Gln Glu Lys Gln Thr Ala Arg His Lys Ala

260 265 270

Ser Ala Ala Pro Gly Lys Asn Ala Ser Gly Glu Asp Glu Val Leu Leu

275 280 285

Gly Ala Arg Pro Arg Pro Arg Lys Pro Ser Phe Leu Lys Ala Leu Leu

290 295 300

Ala Thr Phe Gly Ser Ser Phe Leu Ile Ser Ala Cys Phe Lys Leu Ile

305 310 315 320

Gln Asp Leu Leu Ser Phe Ile Asn Pro Gln Leu Leu Ser Ile Leu Ile

325 330 335

Arg Phe Ile Ser Asn Pro Met Ala Pro Ser Trp Trp Gly Phe Leu Val

340 345 350

Ala Gly Leu Met Phe Leu Cys Ser Met Met Gln Ser Leu Ile Leu Gln

355 360 365

His Tyr Tyr His Tyr Ile Phe Val Thr Gly Val Lys Phe Arg Thr Gly

370 375 380

Ile Met Gly Val Ile Tyr Arg Lys Ala Leu Val Ile Thr Asn Ser Val

385 390 395 400

Lys Arg Ala Ser Thr Val Gly Glu Ile Val Asn Leu Met Ser Val Asp

405 410 415

Ala Gln Arg Phe Met Asp Leu Ala Pro Phe Leu Asn Leu Leu Trp Ser

420 425 430

Ala Pro Leu Gln Ile Ile Leu Ala Ile Tyr Phe Leu Trp Gln Asn Leu

435 440 445

Gly Pro Ser Val Leu Ala Gly Val Ala Phe Met Val Leu Leu Ile Pro

450 455 460

Leu Asn Gly Ala Val Ala Val Lys Met Arg Ala Phe Gln Val Lys Gln

465 470 475 480

Met Lys Leu Lys Asp Ser Arg Ile Lys Leu Met Ser Glu Ile Leu Asn

485 490 495

Gly Ile Lys Val Leu Lys Leu Tyr Ala Trp Glu Pro Ser Phe Leu Lys

500 505 510

Gln Val Glu Gly Ile Arg Gln Gly Glu Leu Gln Leu Leu Arg Thr Ala

515 520 525

Ala Tyr Leu His Thr Thr Thr Thr Phe Thr Trp Met Cys Ser Pro Phe

530 535 540

Leu Val Thr Leu Ile Thr Leu Trp Val Tyr Val Tyr Val Asp Pro Asn

545 550 555 560

Asn Val Leu Asp Ala Glu Lys Ala Phe Val Ser Val Ser Leu Phe Asn

565 570 575

Ile Leu Arg Leu Pro Leu Asn Met Leu Pro Gln Leu Ile Ser Asn Leu

580 585 590

Thr Gln Ala Ser Val Ser Leu Lys Arg Ile Gln Gln Phe Leu Ser Gln

595 600 605

Glu Glu Leu Asp Pro Gln Ser Val Glu Arg Lys Thr Ile Ser Pro Gly

610 615 620

Tyr Ala Ile Thr Ile His Ser Gly Thr Phe Thr Trp Ala Gln Asp Leu

625 630 635 640

Pro Pro Thr Leu His Ser Leu Asp Ile Gln Val Pro Lys Gly Ala Leu

645 650 655

Val Ala Val Val Gly Pro Val Gly Cys Gly Lys Ser Ser Leu Val Ser

660 665 670

Ala Leu Leu Gly Glu Met Glu Lys Leu Glu Gly Lys Val His Met Lys

675 680 685

Gly Ser Val Ala Tyr Val Pro Gln Gln Ala Trp Ile Gln Asn Cys Thr

690 695 700

Leu Gln Glu Asn Val Leu Phe Gly Lys Ala Leu Asn Pro Lys Arg Tyr

705 710 715 720

Gln Gln Thr Leu Glu Ala Cys Ala Leu Leu Ala Asp Leu Glu Met Leu

725 730 735

Pro Gly Gly Asp Gln Thr Glu Ile Gly Glu Lys Gly Ile Asn Leu Ser

740 745 750

Gly Gly Gln Arg Gln Arg Val Ser Leu Ala Arg Ala Val Tyr Ser Asp

755 760 765

Ala Asp Ile Phe Leu Leu Asp Asp Pro Leu Ser Ala Val Asp Ser His

770 775 780

Val Ala Lys His Ile Phe Asp His Val Ile Gly Pro Glu Gly Val Leu

785 790 795 800

Ala Gly Lys Thr Arg Val Leu Val Thr His Gly Ile Ser Phe Leu Pro

805 810 815

Gln Thr Asp Phe Ile Ile Val Leu Ala Asp Gly Gln Val Ser Glu Met

820 825 830

Gly Pro Tyr Pro Ala Leu Leu Gln Arg Asn Gly Ser Phe Ala Asn Phe

835 840 845

Leu Cys Asn Tyr Ala Pro Asp Glu Asp Gln Gly His Leu Glu Asp Ser

850 855 860

Trp Thr Ala Leu Glu Gly Ala Glu Asp Lys Glu Ala Leu Leu Ile Glu

865 870 875 880

Asp Thr Leu Ser Asn His Thr Asp Leu Thr Asp Asn Asp Pro Val Thr

885 890 895

Tyr Val Val Gln Lys Gln Phe Met Arg Gln Leu Ser Ala Leu Ser Ser

900 905 910

Asp Gly Glu Gly Gln Gly Arg Pro Val Pro Arg Arg His Leu Gly Pro

915 920 925

Ser Glu Lys Val Gln Val Thr Glu Ala Lys Ala Asp Gly Ala Leu Thr

930 935 940

Gln Glu Glu Lys Ala Ala Ile Gly Thr Val Glu Leu Ser Val Phe Trp

945 950 955 960

Asp Tyr Ala Lys Ala Val Gly Leu Cys Thr Thr Leu Ala Ile Cys Leu

965 970 975

Leu Tyr Val Gly Gln Ser Ala Ala Ala Ile Gly Ala Asn Val Trp Leu

980 985 990

Ser Ala Trp Thr Asn Asp Ala Met Ala Asp Ser Arg Gln Asn Asn Thr

995 1000 1005

Ser Leu Arg Leu Gly Val Tyr Ala Ala Leu Gly Ile Leu Gln Gly Phe

1010 1015 1020

Leu Val Met Leu Ala Ala Met Ala Met Ala Ala Gly Gly Ile Gln Ala

1025 1030 1035 1040

Ala Arg Val Leu His Gln Ala Leu Leu His Asn Lys Ile Arg Ser Pro

1045 1050 1055

Gln Ser Phe Phe Asp Thr Thr Pro Ser Gly Arg Ile Leu Asn Cys Phe

1060 1065 1070

Ser Lys Asp Ile Tyr Val Val Asp Glu Val Leu Ala Pro Val Ile Leu

1075 1080 1085

Met Leu Leu Asn Ser Phe Phe Asn Ala Ile Ser Thr Leu Val Val Ile

1090 1095 1100

Met Ala Ser Thr Pro Leu Phe Thr Val Val Ile Leu Pro Leu Ala Val

1105 1110 1115 1120

Leu Tyr Thr Leu Val Gln Arg Phe Tyr Ala Ala Thr Ser Arg Gln Leu

1125 1130 1135

Lys Arg Leu Glu Ser Val Ser Arg Ser Pro Ile Tyr Ser His Phe Ser

1140 1145 1150

Glu Thr Val Thr Gly Ala Ser Val Ile Arg Ala Tyr Asn Arg Ser Arg

1155 1160 1165

Asp Phe Glu Ile Ile Ser Asp Thr Lys Val Asp Ala Asn Gln Arg Ser

1170 1175 1180

Cys Tyr Pro Tyr Ile Ile Ser Asn Arg Trp Leu Ser Ile Gly Val Glu

1185 1190 1195 1200

Phe Val Gly Asn Cys Val Val Leu Phe Ala Ala Leu Phe Ala Val Ile

1205 1210 1215

Gly Arg Ser Ser Leu Asn Pro Gly Leu Val Gly Leu Ser Val Ser Tyr

1220 1225 1230

Ser Leu Gln Val Thr Phe Ala Leu Asn Trp Met Ile Arg Met Met Ser

1235 1240 1245

Asp Leu Glu Ser Asn Ile Val Ala Val Glu Arg Val Lys Glu Tyr Ser

1250 1255 1260

Lys Thr Glu Thr Glu Ala Pro Trp Val Val Glu Gly Ser Arg Pro Pro

1265 1270 1275 1280

Glu Gly Trp Pro Pro Arg Gly Glu Val Glu Phe Arg Asn Tyr Ser Val

1285 1290 1295

Arg Tyr Arg Pro Gly Leu Asp Leu Val Leu Arg Asp Leu Ser Leu His

1300 1305 1310

Val His Gly Gly Glu Lys Val Gly Ile Val Gly Arg Thr Gly Ala Gly

1315 1320 1325

Lys Ser Ser Met Thr Leu Cys Leu Phe Arg Ile Leu Glu Ala Ala Lys

1330 1335 1340

Gly Glu Ile Arg Ile Asp Gly Leu Asn Val Ala Asp Ile Gly Leu His
1345 1350 1355 1360

Asp Leu Arg Ser Gln Leu Thr Ile Ile Pro Gln Asp Pro Ile Leu Phe

1365 1370 1375

Ser Gly Thr Leu Arg Met Asn Leu Asp Pro Phe Gly Ser Tyr Ser Glu

1380 1385 1390

Glu Asp Ile Trp Trp Ala Leu Glu Leu Ser His Leu His Thr Phe Val

1395 1400 1405

Ser Ser Gln Pro Ala Gly Leu Asp Phe Gln Cys Ser Glu Gly Gly Glu

1410 1415 1420

Asn Leu Ser Val Gly Gln Arg Gln Leu Val Cys Leu Ala Arg Ala Leu
1425 1430 1435 1440

Leu Arg Lys Ser Arg Ile Leu Val Leu Asp Glu Ala Thr Ala Ala Ile

1445 1450 1455

Asp Leu Glu Thr Asp Asn Leu Ile Gln Ala Thr Ile Arg Thr Gln Phe

1460 1465 1470

Asp Thr Cys Thr Val Leu Thr Ile Ala His Arg Leu Asn Thr Ile Met

1475 1480 1485

Asp Tyr Thr Arg Val Leu Val Leu Asp Lys Gly Val Val Ala Glu Phe

1490 1495 1500

Asp Ser Pro Ala Asn Leu Ile Ala Ala Arg Gly Ile Phe Tyr Gly Met
1505 1510 1515 1520

Ala Arg Asp Ala Gly Leu Ala
1525
<213> Homo sapiens



<400> 7 

atggccgcgc ctgctgagcc ctgcgcgggg cagggggtct ggaaccagac agagcctgaa 60 

cctgccgcca ccagcctgct gagcctgtgc ttcctgagaa cagcaggggt ctgggtaccc 120 

cccatgtacc tctgggtcct tggtcccatc tacctcctct tcatccacca ccatggccgg 180 

ggctacctcc ggatgtcccc actcttcaaa gccaagatgg tgcttggatt cgccctcata 240 

gtcctgtgta cctccagcgt ggctgtcgct ctttggaaaa tccaacaggg aacgcctgag 300 

gccccagaat tcctcattca tcctactgtg tggctcacca cgatgagctt cgcagtgttc 360

ctgattcaca ccgagaggaa aaagggagtc cagtcatctg gagtgctgtt tggttactgg 420 

cttctctgct ttgtcttgcc agctaccaac gctgcccagc aggcctccgg agcgggcttc 480 

cagagcgacc ctgtccgcca cctgtccacc tacctatgcc tgtctctggt ggtggcacag 540 

tttgtgctgt cctgcctggc ggatcaaccc cccttcttcc ctgaagaccc ccagcagtct 600 

aacccctgtc cagagactgg ggcagccttc ccctccaaag ccacgttctg gtgggtttct 660 

ggcctggtct ggaggggata caggaggcca ctgagaccaa aagacctctg gtcgcttggg 720 

agagaaaact cctcagaaga acttgtttcc cggcttgaaa aggagtggat gaggaaccgc 780

agtgcagccc ggaggcacaa caaggcaata gcatttaaaa ggaaaggcgg cagtggcatg 840 

aaggctccag agaccgagcc cttcctacgg caagaaggga gccagtggcg cccactgctg 900 

aaggccatct ggcaggtgtt ccattctacc ttcctcctgg ggaccctcag cctcatcatc 960 

agtgatgtct tcaggttcac tgtccccaag ctgctcagcc ttttcctgga gtttattggt 1020 

gatcccaagc ctccagcctg gaagggctac ctcctcgccg tgctgatgtt cctctcagcc 1080 

tgcctgcaaa cgctgtttga gcagcagaac atgtacaggc tcaaggtgcc gcagatgagg 1140 

ttgcggtcgg ccatcactgg cctggtgtac agaaaggtcc tggctctgtc cagcggctcc 1200 

agaaaggcca gtgcggtggg tgatgtggtc aatctggtgt ccgtggacgt gcagcggctg 1260

accgagagcg tcctctacct caacgggctg tggctgcctc tcgtctggat cgtggtctgc 1320 

ttcgtctatc tctggcagct cctggggccc tccgccctca ctgccatcgc tgtcttcctg 1380 

agcctcctcc ctctgaattt cttcatctcc aagaaaagga accaccatca ggaggagcaa 1440 

atgaggcaga aggactcacg ggcacggctc accagctcta tcctcaggaa ctcgaagacc 1500 

atcaagttcc atggctggga gggagccttt ctggacagag tcctgggcat ccgaggccag 1560

gagctgggcg ccttgcggac ctccggcctc ctcttctctg tgtcgctggt gtccttccaa 1620 

gtgtctacat ttctggtcgc actggtggtg tttgctgtcc acactctggt ggccgagaat 1680 

gctatgaatg cagagaaagc ctttgtgact ctcacagttc tcaacatcct caacaaggcc 1740 



O J *B H\7 j. H 13 SJ ^ii nJI : 7 O O 



caggctttcc 
ctggtcacct 
ggaagcgctg 
gaaagccctc 
gttgtcggtc 
tcaaaggtgg 
tgggtgcaga 
tggctggaga 
ggaatccaca 
ctgagcctgg 
gcggccctgg 
ctac tccagg 
gattggatca 
ctgcagagga 
ggagaaggag 
aggaggcccg 
acttcagaag 
ggaaaggaca 
gccgtgggca 
tccttctgcc 
cagacgcagg 
gggctgtttg 
ttccagaggc 
attggtcacc 
gacaaactcc 
gcagtggcta 
tttcagagcc 
tcgtctgtct 
cgaacccagg 
agtt tcccgc 
ggcc tggtgt 
ctcgtgggct 
cgcaactgga 
tggacgccca 
cagggcgggc 
gctgtgcagg 
accggggcag 
ggtgggatct 
aggatcagca 
gacctgctgc 
aaagccttgg 
gacctgagcg 
cagatcctca 
caggccatgc 
cgctccgtga 
ggcagcccgg 
ggcctggtc 



tgcccttctc 
tcctctgcct 
ccgggaagga 
cctgcctcca 
cagtgggggc 
aggggttcgt 
acacctctgt 
gagtactaga 
cttcaattgg 
cccgggctgt 
atgcccacgt 
gaacaacacg 
tagtgctggc 
agggggccct 
aaacagaacc 
agcttagacg 
cccagacaga 
gcatccaata 
cccccctctg 

ggggctactg 

cagccctgcg 
cctccatggc 
tcctgtggga 
tgctaaaccg 
ggtccctgct 
ccccactggc 
tgtatgtggt 
gctcccacat 
ccccc t t tgt 
gactggtggc 
ttgcagccgc 
tctctgtc tc 
cagacctaga 
aggaggctcc 
agatcgagtt 
gcgtgtcc tt 
ggaagtcc tc 
ggatcgacgg 
tcatccccca 
aggagcac tc 
tggccagcct 
tgggccagaa 
tcctggacga 
tcgggagctg 
tggactgtgc 
cccagctgct 



catccactcc 
ggaagaagtt 
ttgcatcacc 
cagaataaac 
agggaagtcc 
gagcatcgag 
ggtagagaat 
agcctgtgcc 
ggagcagggc 
atacagaaag 
tggccagcat 
gattctcgtg 
aaatggggcc 
cgtgtgtctt 
tgggaccagc 
cgagaggt cc 
ggttcctc tg 
cggcagggtg 
cctc tacgca 
gctgagcc tg 
tggcgggatc 
tgcggtgctc 
tgtggtgcga 
cttctccaag 
gatgtacgcc 
cactgtggcc 
tagctcatgc 
ggctgagacg 
ggc tcagaac 
tgacaggtgg 
cacgtgtgct 
tgctgccc tc 
gaacagcatc 
ctggaggc tg 
ccgggact tt 
caagatccac 
cctggccagt 
ggtccccatt 
ggaccccatc 
ggacgaggct 
gcccggccag 
acagctcc tg 
ggctactgct 
gtttgcacag 
ccgggttctg 
ggcccagaag 



ctcgtccagg 
gaccctggtg 
atacacagtg 
ctcacggtgc 
tccc tgctgt 
ggtgctgtgg 
gtgtgcttcg 
ctgcagccag 
atgaatctct 
gcagctgtgt 
gtcttcaacc 
acgcacgcac 
atcgcagaga 
ctggatcaag 
accaaggacc 
atcaagtcag 
gatgaccctg 
aaggccacag 
ctct tcctct 
tgggcggacg 
ttcgggctcc 
ctaggtgggg 
tctcccatca 
gagacagaca 
tttggactcc 
atcctgccac 
cagctgagac 
ttccagggca 
aatgctcgcg 
cttgcggcca 
gtgctgagca 
caggtgaccc 
gtgtcagtgg 
cccacatgtg 
gggctaagat 
gcaggagaga 
gggctgctgc 
gcccacgtgg 
ctgttccctg 
atctgggcag 
ctgcagtaca 
tgtctggcac 
gccgtggacc 
tgcactgtgc 
gtcatggaca 
ggcctgtttt 



cccgggtgtc 
tcgtagactc 
ccaccttcgc 
cccagggctg 
ccgccctcct 
cctacgtgcc 
ggcaggagct 
atgtggacag 
ccggaggcca 
acctgctgga 
aggtcattgg 
tccacatcct 
tgggttccta 
ccagacagcc 
ccagaggcac 
tccctgagaa 
acagggcagg 
tgcacctggc 
tcctctgcca 
accctgcagt 
tcggctgtct 
cccgggcatc 
gcttctttga 
cggttgacgt 
tggaggtcag 
tgtttctcct 
gcttggagtc 
gcacagtggt 
tagatgaaag 
atgtggagct 
aagcccacct 
agacactgca 
agcggatgca 
cagctcagcc 
gccgacctga 
aggtgggcat 
ggctccagga 
ggctgcacac 
gctctctgcg 
ccctggagac 
agtgtgctga 
gtgcccttct 
ctggcacgga 
tgcccattgc 
aggggcaggt 
acagactggc 



ctttgaccgt 
aagttcctct 
ctggtcccag 
tctgctggct 
tggggagctg 
ccaggaggcc 
ggacccaccc 
cttccctgag 
gaagcagcgg 
tgaccccctg 
gcctggtggg 
gccccaggct 
ccaggagctt 
aggagataga 
ctctgcaggc 
ggaccgtacc 
atggccagca 
ctacctgcgt 
gcaagtggcc 
aggtgggcag 
ccaagccatt 
caggttgctc 
gcggacaccc 
ggacattcca 
cctggtggtg 
ctacgctggg 
agccagctac 
ccgggcattc 
ccagaggatc 
cctggggaat 
cagtgctggc 
gtgggttgtt 
ggactatgcc 
cccctggcct 
gctcccgctg 
cgttggcagg 
ggcagctgag 
actgcgctcc 
gatgaacctc 
ggtgcagctc 
ccgaggcgag 
ccggaagacc 
gctgcagatg 
ccaccgcctg 
ggcagagagc 
ccaggagtca 



1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4509 
8 
























Met Ala Ala Pro Ala Glu Pro Cys Ala Gly Gln Gly Val Trp Asn Gln 
1 5 10 15 

Thr Glu Pro Glu Pro Ala Ala Thr Ser Leu Leu Ser Leu Cys Phe Leu 
20 25 30 

Arg Thr Ala Gly Val Trp Val Pro Pro Met Tyr Leu Trp Val Leu Gly 
35 40 45 

Pro Ile Tyr Leu Leu Phe Ile His His His Gly Arg Gly Tyr Leu Arg 
50 55 60 

Met Ser Pro Leu Phe Lys Ala Lys Met Val Leu Gly Phe Ala Leu Ile 
65 70 75 80 

Val Leu Cys Thr Ser Ser Val Ala Val Ala Leu Trp Lys Ile Gln Gln 
85 90 95 

Gly Thr Pro Glu Ala Pro Glu Phe Leu Ile His Pro Thr Val Trp Leu 
100 105 110 

Thr Thr Met Ser Phe Ala Val Phe Leu Ile His Thr Glu Arg Lys Lys 
115 120 125 

Gly Val Gln Ser Ser Gly Val Leu Phe Gly Tyr Trp Leu Leu Cys Phe 
130 135 140 

Val Leu Pro Ala Thr Asn Ala Ala Gln Gln Ala Ser Gly Ala Gly Phe 
145 150 155 160 

Gln Ser Asp Pro Val Arg His Leu Ser Thr Tyr Leu Cys Leu Ser Leu 
165 170 175 

Val Val Ala Gln Phe Val Leu Ser Cys Leu Ala Asp Gln Pro Pro Phe 
180 185 190 

Phe Pro Glu Asp Pro Gln Gln Ser Asn Pro Cys Pro Glu Thr Gly Ala 
195 200 205 

Ala Phe Pro Ser Lys Ala Thr Phe Trp Trp Val Ser Gly Leu Val Trp 
210 215 220 

Arg Gly Tyr Arg Arg Pro Leu Arg Pro Lys Asp Leu Trp Ser Leu Gly 
225 230 235 240 

Arg Glu Asn Ser Ser Glu Glu Leu Val Ser Arg Leu Glu Lys Glu Trp 
245 250 255 

Met Arg Asn Arg Ser Ala Ala Arg Arg His Asn Lys Ala Ile Ala Phe 
260 265 270 

Lys Arg Lys Gly Gly Ser Gly Met Lys Ala Pro Glu Thr Glu Pro Phe 
275 280 285 

Leu Arg Gln Glu Gly Ser Gln Trp Arg Pro Leu Leu Lys Ala Ile Trp 
290 295 300 

Gln Val Phe His Ser Thr Phe Leu Leu Gly Thr Leu Ser Leu Ile Ile 
305 310 315 320 

Ser Asp Val Phe Arg Phe Thr Val Pro Lys Leu Leu Ser Leu Phe Leu 
325 330 335 

Glu Phe Ile Gly Asp Pro Lys Pro Pro Ala Trp Lys Gly Tyr Leu Leu 
340 345 350 

Ala Val Leu Met Phe Leu Ser Ala Cys Leu Gln Thr Leu Phe Glu Gln 
355 360 365 

Gln Asn Met Tyr Arg Leu Lys Val Pro Gln Met Arg Leu Arg Ser Ala 
370 375 380 

Ile Thr Gly Leu Val Tyr Arg Lys Val Leu Ala Leu Ser Ser Gly Ser 
385 390 395 400 

Arg Lys Ala Ser Ala Val Gly Asp Val Val Asn Leu Val Ser Val Asp 
405 410 415 

Val Gln Arg Leu Thr Glu Ser Val Leu Tyr Leu Asn Gly Leu Trp Leu 
420 425 430 

Pro Leu Val Trp Ile Val Val Cys Phe Val Tyr Leu Trp Gln Leu Leu 
435 440 445 

Gly Pro Ser Ala Leu Thr Ala Ile Ala Val Phe Leu Ser Leu Leu Pro 
450 455 460 

Leu Asn Phe Phe Ile Ser Lys Lys Arg Asn His His Gln Glu Glu Gln 
465 470 475 480 

Met Arg Gln Lys Asp Ser Arg Ala Arg Leu Thr Ser Ser Ile Leu Arg 
485 490 495 

Asn Ser Lys Thr Ile Lys Phe His Gly Trp Glu Gly Ala Phe Leu Asp 
500 505 510 

Arg Val Leu Gly Ile Arg Gly Gln Glu Leu Gly Ala Leu Arg Thr Ser 
515 520 525 

Gly Leu Leu Phe Ser Val Ser Leu Val Ser Phe Gln Val Ser Thr Phe 
530 535 540 

Leu Val Ala Leu Val Val Phe Ala Val His Thr Leu Val Ala Glu Asn 
545 550 555 560 

Ala Met Asn Ala Glu Lys Ala Phe Val Thr Leu Thr Val Leu Asn Ile 
565 570 575 

Leu Asn Lys Ala Gln Ala Phe Leu Pro Phe Ser Ile His Ser Leu Val 
580 585 590 

Gln Ala Arg Val Ser Phe Asp Arg Leu Val Thr Phe Leu Cys Leu Glu 
595 600 605 

Glu Val Asp Pro Gly Val Val Asp Ser Ser Ser Ser Gly Ser Ala Ala 
610 615 620 

Gly Lys Asp Cys Ile Thr Ile His Ser Ala Thr Phe Ala Trp Ser Gln 
625 630 635 640 

Glu Ser Pro Pro Cys Leu His Arg Ile Asn Leu Thr Val Pro Gln Gly 
645 650 655 

Cys Leu Leu Ala Val Val Gly Pro Val Gly Ala Gly Lys Ser Ser Leu 
660 665 670 

Leu Ser Ala Leu Leu Gly Glu Leu Ser Lys Val Glu Gly Phe Val Ser 
675 680 685 

Ile Glu Gly Ala Val Ala Tyr Val Pro Gln Glu Ala Trp Val Gln Asn 
690 695 700 

Thr Ser Val Val Glu Asn Val Cys Phe Gly Gln Glu Leu Asp Pro Pro 
705 710 715 720 

Trp Leu Glu Arg Val Leu Glu Ala Cys Ala Leu Gln Pro Asp Val Asp 
725 730 735 

Ser Phe Pro Glu Gly Ile His Thr Ser Ile Gly Glu Gln Gly Met Asn 
740 745 750 

Leu Ser Gly Gly Gln Lys Gln Arg Leu Ser Leu Ala Arg Ala Val Tyr 
755 760 765 

Arg Lys Ala Ala Val Tyr Leu Leu Asp Asp Pro Leu Ala Ala Leu Asp 
770 775 780 

Ala His Val Gly Gln His Val Phe Asn Gln Val Ile Gly Pro Gly Gly 
785 790 795 800 

Leu Leu Gln Gly Thr Thr Arg Ile Leu Val Thr His Ala Leu His Ile 
805 810 815 

Leu Pro Gln Ala Asp Trp Ile Ile Val Leu Ala Asn Gly Ala Ile Ala 
820 825 830 

Glu Met Gly Ser Tyr Gln Glu Leu Leu Gln Arg Lys Gly Ala Leu Val 
835 840 845 

Cys Leu Leu Asp Gln Ala Arg Gln Pro Gly Asp Arg Gly Glu Gly Glu 
850 855 860 

Thr Glu Pro Gly Thr Ser Thr Lys Asp Pro Arg Gly Thr Ser Ala Gly 
865 870 875 880 

Arg Arg Pro Glu Leu Arg Arg Glu Arg Ser Ile Lys Ser Val Pro Glu 
885 890 895 

Lys Asp Arg Thr Thr Ser Glu Ala Gln Thr Glu Val Pro Leu Asp Asp 
900 905 910 

Pro Asp Arg Ala Gly Trp Pro Ala Gly Lys Asp Ser Ile Gln Tyr Gly 
915 920 925 

Arg Val Lys Ala Thr Val His Leu Ala Tyr Leu Arg Ala Val Gly Thr 
930 935 940 

Pro Leu Cys Leu Tyr Ala Leu Phe Leu Phe Leu Cys Gln Gln Val Ala 
945 950 955 960 

Ser Phe Cys Arg Gly Tyr Trp Leu Ser Leu Trp Ala Asp Asp Pro Ala 
965 970 975 

Val Gly Gly Gln Gln Thr Gln Ala Ala Leu Arg Gly Gly Ile Phe Gly 
980 985 990 

Leu Leu Gly Cys Leu Gln Ala Ile Gly Leu Phe Ala Ser Met Ala Ala 
995 1000 1005 

Val Leu Leu Gly Gly Ala Arg Ala Ser Arg Leu Leu Phe Gln Arg Leu 
1010 1015 1020 

Leu Trp Asp Val Val Arg Ser Pro Ile Ser Phe Phe Glu Arg Thr Pro 
1025 1030 1035 1040 

Ile Gly His Leu Leu Asn Arg Phe Ser Lys Glu Thr Asp Thr Val Asp 
1045 1050 1055 

Val Asp Ile Pro Asp Lys Leu Arg Ser Leu Leu Met Tyr Ala Phe Gly 
1060 1065 1070 

Leu Leu Glu Val Ser Leu Val Val Ala Val Ala Thr Pro Leu Ala Thr 
1075 1080 1085 

Val Ala Ile Leu Pro Leu Phe Leu Leu Tyr Ala Gly Phe Gln Ser Leu 
1090 1095 1100 

Tyr Val Val Ser Ser Cys Gln Leu Arg Arg Leu Glu Ser Ala Ser Tyr 
1105 1110 1115 1120 

Ser Ser Val Cys Ser His Met Ala Glu Thr Phe Gln Gly Ser Thr Val 
1125 1130 1135 

Val Arg Ala Phe Arg Thr Gln Ala Pro Phe Val Ala Gln Asn Asn Ala 
1140 1145 1150 

Arg Val Asp Glu Ser Gln Arg Ile Ser Phe Pro Arg Leu Val Ala Asp 
1155 1160 1165 

Arg Trp Leu Ala Ala Asn Val Glu Leu Leu Gly Asn Gly Leu Val Phe 
1170 1175 1180 

Ala Ala Ala Thr Cys Ala Val Leu Ser Lys Ala His Leu Ser Ala Gly 
1185 1190 1195 1200 

Leu Val Gly Phe Ser Val Ser Ala Ala Leu Gln Val Thr Gln Ala Leu 
1205 1210 1215 

Gln Trp Val Val Arg Asn Trp Thr Asp Leu Glu Asn Ser Ile Val Ser 
1220 1225 1230 

Val Glu Arg Met Gln Asp Tyr Ala Trp Thr Pro Lys Glu Ala Pro Trp 
1235 1240 1245 

Arg Leu Pro Thr Cys Ala Ala Gln Pro Pro Trp Pro Gln Gly Gly Gln 
1250 1255 1260 

Ile Glu Phe Arg Asp Phe Gly Leu Arg Tyr Arg Pro Glu Leu Pro Leu 
1265 1270 1275 1280 

Ala Val Gln Gly Val Ser Leu Lys Ile His Ala Gly Glu Lys Val Gly 
1285 1290 1295 

Ile Val Gly Arg Thr Gly Ala Gly Lys Ser Ser Leu Ala Ser Gly Leu 
1300 1305 1310 

Leu Arg Leu Gln Glu Ala Ala Glu Gly Gly Ile Trp Ile Asp Gly Val 
1315 1320 1325 

Pro Ile Ala His Val Gly Leu His Thr Leu Arg Ser Arg Ile Ser Ile 
1330 1335 1340 

Ile Pro Gln Asp Pro Ile Leu Phe Pro Gly Ser Leu Arg Met Asn Leu 
1345 1350 1355 1360 

Asp Leu Leu Gln Glu His Ser Asp Glu Ala Ile Trp Ala Ala Leu Glu 
1365 1370 1375 

Thr Val Gln Leu Lys Ala Leu Val Ala Ser Leu Pro Gly Gln Leu Gln 
1380 1385 1390 

Tyr Lys Cys Ala Asp Arg Gly Glu Asp Leu Ser Val Gly Gln Lys Gln 
1395 1400 1405 

Leu Leu Cys Leu Ala Arg Ala Leu Leu Arg Lys Thr Gln Ile Leu Ile 
1410 1415 1420 

Leu Asp Glu Ala Thr Ala Ala Val Asp Pro Gly Thr Glu Leu Gln Met 
1425 1430 1435 1440 

Gln Ala Met Leu Gly Ser Trp Phe Ala Gln Cys Thr Val Leu Leu Ile 
1445 1450 1455 

Ala His Arg Leu Arg Ser Val Met Asp Cys Ala Arg Val Leu Val Met 
1460 1465 1470 

Asp Lys Gly Gln Val Ala Glu Ser Gly Ser Pro Ala Gln Leu Leu Ala 
1475 1480 1485 

Gln Lys Gly Leu Phe Tyr Arg Leu Ala Gln Glu Ser Gly Leu Val 
1490 1495 1500 



<210> 9 
<211> 18 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<220> 
<221> misc_feature 
<222> (3)...(15) 
<223> d = a, g or t 

<220> 
<221> misc_feature 
<222> (18)...(18) 
<223> n = a, c, g or t 

<400> 9 
ctdgtdgcdg tdgtdggn 

<210> 10 
<211> 19 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<400> 10 
atggccgcgc ctgctgagc 

<210> 11 
<211> 20 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<400> 11 
gtctacgaca ccagggtcaa 

<210> 12 
<211> 20 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<400> 12 
ctgcctggaa gaagttgacc 

<210> 13 
<211> 20 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<400> 13 
ctggaatgtc cacgtcaacc 

<210> 14 
<211> 20 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<400> 14 
ggagacagac acggttgacg 

<210> 15 
<211> 19 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<400> 15 
gcagaccagg cctgactcc 



<210> 16 
<211> 24 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<220> 
<221> misc_feature 
<222> (1)...(22) 
<223> r = a or g 

<220> 
<221> misc_feature 
<222> (4)...(19) 
<223> n = a, c, g or t 

<220> 
<221> misc_feature 
<222> (6)...(6) 
<223> v = a, c or g 

<220> 
<221> misc_feature 
<222> (11)...(11) 
<223> s = c or g 

<220> 
<221> misc_feature 
<222> (12)...(12) 
<223> w = a or t 

<400> 16 
rctnavngcn swnarnggnt crtc 



<210> 17 
<211> 29 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<220> 
<221> misc_feature 
<222> (11)...(14) 
<223> r = a or g 

<220> 
<221> misc_feature 
<222> (17)...(17) 
<223> y = c or t 

<220> 
<221> misc_feature 
<222> (20)...(20) 
<223> h = a, c or t 

<220> 
<221> misc_feature 
<222> (23)...(29) 
<223> n = a, c, g or t 

<400> 17 
cgggatccag rgaraayath ctntttggn 29 



<210> 18 
<211> 29 
<212> DNA 
<213> Artificial Sequence 

<220> 
<223> Sequence source: /note="synthetic construct" 

<220> 
<221> misc_feature 
<222> (9)...(18) 
<223> n = a, c, g or t 

<220> 
<221> misc_feature 
<222> (12)...(27) 
<223> r = a or g 

<220> 
<221> misc_feature 
<222> (15)...(15) 
<223> h = a, c or t 

<220> 
<221> misc_feature 
<222> (24)...(24) 
<223> d = a, g or t 

<400> 18 
cggaattcnt crtchagnag rtadatrtc 29 

SEQUENCE LISTING 

<130> FCCC 98-02 

<150> 60/079,759 
<151> 1998-03-27 

<150> 60/095,153 
<151> 1998-08-03 

<160> 18 

<170> FastSEQ for Windows Version 3.0 

<210> 1 

<211> 4231 

<212> DNA 

<213> Homo sapiens 



<400> 1 

ggacaggcgt ggcggccgga gccccagcat ccctgcttga ggtccaggag cggagcccgc 60 

ggccaccgcc gcctgatcag cgcgaccccg gcccgcgccc gccccgcccg gcaagatgct 120 

gcccgtgtac caggaggtga agcccaaccc gctgcaggac gcgaacatct gctcacgcgt 180 

gttcttctgg tggctcaatc ccttgtttaa aattggccat aaacggagat tagaggaaga 240 

tgatatgtat tcagtgctgc cagaagaccg ctcacagcac cttggagagg agttgcaagg 300 

gttctgggat aaagaagttt taagagctga gaatgacgca cagaagcctt ctttaacaag 360 

agcaatcata aagtgttact ggaaatctta tttagttttg ggaattttta cgttaattga 420 

ggaaagtgcc aaagtaatcc agcccatatt tttgggaaaa attattaatt attttgaaaa 480 

ttatgatccc atggattctg tggctttgaa cacagcgtac gcctatgcca cggtgctgac 540 

tttttgcacg ctcattttgg ctatactgca tcacttatat ttttatcacg ttcagtgtgc 600 

tgggatgagg ttacgagtag ccatgtgcca tatgatttat cggaaggcac ttcgtcttag 660 

taacatggcc atggggaaga caaccacagg ccagatagtc aatctgctgt ccaatgatgt 720 

gaacaagttt gatcaggtga cagtgttctt acacttcctg tgggcaggac cactgcaggc 780 

gatcgcagtg actgccctac tctggatgga gataggaata tcgtgccttg ctgggatggc 840 

agttctaatc attctcctgc ccttgcaaag ctgttttggg aagttgttct catcactgag 900 

gagtaaaact gcaactttca cggatgccag gatcaggacc atgaatgaag ttataactgg 960 

tataaggata ataaaaatgt acgcctggga aaagtcattt tcaaatctta ttaccaattt 1020 

gagaaagaag gagatttcca agattctgag aagttcctgc ctcaggggga tgaatttggc 1080 

ttcgtttttc agtgcaagca aaatcatcgt gtttgtgacc ttcaccacct acgtgctcct 1140 

cggcagtgtg atcacagcca gccgcgtgtt cgtggcagtg acgctgtatg gggctgtgcg 1200 

gctgacggtt accctcttct tcccctcagc cattgagagg gtgtcagagg caatcgtcag 1260 

catccgaaga atccagacct ttttgctact tgatgagata tcacagcgca accgtcagct 1320 

gccgtcagat ggtaaaaaga tggtgcatgt gcaggatttt actgcttttt gggataaggc 1380 

atcagagacc ccaactctac aaggcctttc ctttactgtc agacctggcg aattgttagc 1440 

tgtggtcggc cccgtgggag cagggaagtc atcactgtta agtgccgtgc tcggggaatt 1500 

ggccccaagt cacgggctgg tcagcgtgca tggaagaatt gcctatgtgt ctcagcagcc 1560 

ctgggtgttc tcgggaactc tgaggagtaa tattttattt gggaagaaat atgaaaagga 1620 

acgatatgaa aaagtcataa aggcttgtgc tctgaaaaag gatttacagc tgttggagga 1680 

tggtgatctg actgtgatag gagatcgggg aaccacgctg agtggagggc agaaagcacg 1740 

ggtaaacctt gcaagagcag tgtatcaaga tgctgacatc tatctcctgg acgatcctct 1800 

cagtgcagta gatgcggaag ttagcagaca cttgttcgaa ctgtgtattt gtcaaatttt 1860 

gcatgagaag atcacaattt tagtgactca tcagttgcag tacctcaaag ctgcaagtca 1920 

gattctgata ttgaaagatg gtaaaatggt gcagaagggg acttacactg agttcctaaa 1980 

atctggtata gattttggct cccttttaaa gaaggataat gaggaaagtg aacaacctcc 2040 

agttccagga actcccacac taaggaatcg taccttctca gagtcttcgg tttggtctca 2100 

acaatcttct agaccctcct tgaaagatgg tgctctggag agccaagata cagagaatgt 2160 

cccagttaca ctatcagagg agaaccgttc tgaaggaaaa gttggttttc aggcctataa 2220 

gaattacttc agagctggtg ctcactggat tgtcttcatt ttccttattc tcctaaacac 2280 

tgcagctcag gttgcctatg tgcttcaaga ttggtggctt tcatactggg caaacaaaca 2340 

aagtatgcta aatgtcactg taaatggagg aggaaatgta accgagaagc tagatcttaa 2400 

ctggtactta ggaatttatt caggtttaac tgtagctacc gttctttttg gcatagcaag 2460 

atctctattg gtattctacg tccttgttaa ctcttcacaa actttgcaca acaaaatgtt 2520 

tgagtcaatt ctgaaagctc cggtattatt ctttgataga aatccaatag gaagaatttt 2580 

aaatcgtttc tccaaagaca ttggacactt ggatgatttg ctgccgctga cgtttttaga 2640 

tttcatccag acattgctac aagtggttgg tgtggtctct gtggctgtgg ccgtgattcc 2700 

ttggatcgca atacccttgg ttccccttgg aatcattttc atttttcttc ggcgatattt 2760 

tttggaaacg tcaagagatg tgaagcgcct ggaatctaca actcggagtc cagtgttttc 2820 

ccacttgtca tcttctctcc aggggctctg gaccatccgg gcatacaaag cagaagagag 2880 

gtgtcaggaa ctgtttgatg cacaccagga tttacattca gaggcttggt tcttgttttt 2940 

gacaacgtcc cgctggttcg ccgtccgtct ggatgccatc tgtgccatgt ttgtcatcat 3000 

cgttgccttt gggtccctga ttctggcaaa aactctggat gccgggcagg ttggtttggc 3060 

actgtcctat gccctcacgc tcatggggat gtttcagtgg tgtgttcgac aaagtgctga 3120 

agttgagaat atgatgatct cagtagaaag ggtcattgaa tacacagacc ttgaaaaaga 3180 

agcaccttgg gaatatcaga aacgcccacc accagcctgg ccccatgaag gagtgataat 3240 

ctttgacaat gtgaacttca tgtacagtcc aggtgggcct ctggtactga agcatctgac 3300 

agcactcatt aaatcacaag aaaaggttgg cattgtggga agaaccggag ctggaaaaag 3360 

ttccctcatc tcagcccttt ttagattgtc agaacccgaa ggtaaaattt ggattgataa 3420 

gatcttgaca actgaaattg gacttcacga tttaaggaag aaaatgtcaa tcatacctca 3480 

ggaacctgtt ttgttcactg gaacaatgag gaaaaacctg gatcccttta aggagcacac 3540 

ggatgaggaa ctgtggaatg ccttacaaga ggtacaactt aaagaaacca ttgaagatct 3600 

tcctggtaaa atggatactg aattagcaga atcaggatcc aattttagtg ttggacaaag 3660 

acaactggtg tgccttgcca gggcaattct caggaaaaat cagatattga ttattgatga 3720 

agcgacggca aatgtggatc caagaactga tgagttaata caaaaaaaaa tccgggagaa 3780 

atttgcccac tgcaccgtgc taaccattgc acacagattg aacaccatta ttgacagcga 3840 

caagataatg gttttagatt caggaagact gaaagaatat gatgagccgt atgttttgct 3900 

gcaaaataaa gagagcctat tttacaagat ggtgcaacaa ctgggcaagg cagaagccgc 3960 

tgccctcact gaaacagcaa aacaggtata cttcaaaaga aattatccac atattggtca 4020 

cactgaccac atggttacaa acacttccaa tggacagccc tcgaccttaa ctattttcga 4080 

gacagcactg tgaatccaac caaaatgtca agtccgttcc gaaggcattt tccactagtt 4140 

tttggactat gtaaaccaca ttgtactttt ttttactttg gcaacaaata tttatacata 4200 

caagatgcta gttcatttga atatttctcc c 4231 



<210> 2 

<211> 1325 

<212> PRT 

<213> Homo 



<400> 2 



Met 


Leu 


Pro 


Val 


1 








Asn 


He 


Cys 


Ser 








20 


lie 


Gly 


His 


Lys 






35 




Pro 


Glu 


Asp 


Arg 




50 






Asp 


Lys 


Glu 


Val 


65 








Thr 


Arg 


Ala 


He 


He 


Phe 


Thr 


Leu 








100 


Leu 


Gly 


Lys 


He 






115 




Val 


Ala 


Leu 


Asn 




130 






Thr 


Leu 


He 


Leu 


145 








Cys 


Ala 


Gly 


Met 


Lys 


Ala 


Leu 


Arg 



sapiens 



Tyr 


Gin 


Glu 


Val 


5 








Arg 


Val 


Phe 


Phe 


Arg 


Arg 


Leu 


Glu 








40 


Ser 


Gin 


His 


Leu 






55 




Leu 


Arg 


Ala 


Glu 




70 






He 


Lys 


Cys 


Tyr 


85 








He 


Glu 


Glu 


Ser 


He 


Asn 


Tyr 


Phe 








120 


Thr 


Ala 


Tyr 


Ala 






135 




Ala 


He 


Leu 


His 




150 






Arg 


Leu 


Arg 


Val 


165 








Leu 


Ser 


Asn 


Met 



Lys 


Pro 


Asn 


Pro 




10 






Trp 


Trp 


Leu 


Asn 


25 








Glu 


Asp 


Asp 


Met 


Gly 


Glu 


Glu 


Leu 








60 


Asn 


Asp 


Ala 


Gin 






75 




Trp 


Lys 


Ser 


Tyr 




90 






Ala 


Lys 


Val 


He 


105 








Glu 


Asn 


Tyr 


Asp 


Tyr 


Ala 


Thr 


Val 








140 


His 


Leu 


Tyr 


Phe 






155 




Ala 


Met 


Cys 


His 




170 






Ala 


Met 


Gly 


Lys 



Leu 


Gin 


Asp 


Ala 






15 




Pro 


Leu 


Phe 


Lys 




30 






Tyr 


Ser 


Val 


Leu 


45 








Gin 


Gly 


Phe 


Trp 


Lys 


Pro 


Ser 


Leu 








80 


Leu 


Val 


Leu 


Gly 






95 




Gin 


Pro 


He 


Phe 




110 






Pro 


Met 


Asp 


Ser 


125 








Leu 


Thr 


Phe 


Cys 


Tyr 


His 


Val 


Gin 








160 


Met 


He 


Tyr 


Arg 






175 




Thr 


Thr 


Thr 


Gly 
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180 



Gin 


He 


Val 


Asn 


Leu 


Leu 


Ser 


Asn 






195 










200 


Thr 


Val 


Phe 


Leu 


His 


Phe 


Leu 


Trp 




210 










215 




Val 


Thr 


Ala 


Leu 


Leu 


Trp 


Met 


Glu 


225 










230 






Met 


Ala 


Val 


Leu 


He 


He 


Leu 


Leu 










245 








Leu 


Phe 


Ser 


Ser 


Leu 


Arg 


Ser 


Lys 








260 










He 


Arg 


Thr 


Met 


Asn 


Glu 


Val 


He 






275 










280 


Tyr 


Ala 


Trp 


Glu 


Lys 


Ser 


Phe 


Ser 




290 










295 




Lys 


Glu 


He 


Ser 


Lys 


He 


Leu 


Arg 


305 










310 






Leu 


Ala 


Ser 


Pha 


The 


Ser 


Ala 


Ser 










325 








Thr 


Thr 


Tyr 


Val 


Leu 


Leu 


Gly 


Ser 








340 










Val 


Ala 


Val 


Thr 


Leu 


Tyr 


Gly 


Ala 






355 










360 


Phe 


Pro 


Ser 


Ala 


He 


Glu 


Arg 


Val 




370 










375 




Arg 


He 


Gin 


Thr 


Phe 


Leu 


Leu 


Leu 


385 










390 






Gin 


Leu 


Pro 


Ser 


Asp 


Gly 


Lys 


Lys 










405 








Ala 


Phe 


Trp 


Asp 


Lys 


Ala 


Ser 


Glu 








420 










Phe 


Thr 


Val 


Arg 


Pro 


Gly 


Glu 


Leu 






435 










440 


Ala 


Gly 


Lys 


Ser 


Ser 


Leu 


Leu 


Ser 




450 










455 




Ser 


His 


Gly 


Leu 


Val 


Ser 


Val 


His 


465 










470 






Gin 


Pro 


Trp 


Val 


Phe 


Ser 


Gly 


Thr 










485 








Lys 


Lys 


Tyr 


Glu 


Lys 


Glu 


Arg 


Tyr 








500 










Leu 


Lys 


Lys 


Asp 


Leu 


Gin 


Leu 


Leu 






515 










520 


Gly 


Asp 


Arg 


Gly 


Thr 


Pro 


Leu 


Ser 




530 










535 




Leu 


Ala 


Arg 


Ala 


Val 


Tyr 


Gin 


Asp 


545 










550 






Pro 


Leu 


Ser 


Ala 


Val 


Asp 


Ala 


Glu 










565 








Cys 


He 


Cys 


Gin 


He 


Leu 


His 


Glu 








580 










Gin 


Leu 


Gin 


Tyr 


Leu 


Lys 


Ala 


Ala 






595 










600 


Gly 


Lys 


Met 


Val 


Gin 


Lys 


Gly 


Thr 




610 










615 




He 


Asp 


Phe 


Gly 


Ser 


Leu 


Leu 


Lys 


625 










630 






Pro 


Pro 


Val 


Pro 


Gly 


Thr 


Pro 


Thr 










645 








Ser 


Ser 


Val 


Trp 


Ser 


Gin 


Gin 


Ser 








660 










Ala 


Leu 


Glu 


Ser 


Gin 


Asp 


Thr 


Glu 






675 










680 


Glu 


Asn 


Arg 


Ser 


Glu 


Gly 


Lys 


Val 




690 










695 




Phe 


Arg 


Ala 


Gly 


Ala 


His 


Trp 


He 
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185 










190 






Asp 


Val 


Asn 


Lys 


Phe 


Asp 


Gin 


Val 










205 








Ala 


Gly 


Pro 


Leu 


Gin 


Ala 


He 


Ala 








220 










He 


Gly 


He 


Ser 


Cys 


Leu 


Ala 


Gly 






235 










240 


Pro 


Leu 


Gin 


Ser 


Cys 


Phe 


Gly 


Lys 




250 










255 




Thr 


Ala 


Thr 


Phe 


Thr 


Asp 


Ala 


Arg 


265 










270 






Thr 


Gly 


He 


Arg 


He 


He 


Lys 


Met 










285 








Asn 


Leu 


He 


Thr 


Asn 


Leu 


Arg 


Lys 








300 










Ser 


Ser 


Cys 


Leu 


Arg 


Gly 


Met 


Asn 






315 










320 


Lys 


He 


He 


Val 


Phe 


Val 


Thr 


Phe 




330 










335 




Val 


He 


Thr 


Ala 


Ser 


Arg 


Val 


Phe 


345 










350 






Val 


Arg 


Leu 


Thr 


Val 


Thr 


Leu 


Phe 










365 








Ser 


Glu 


Ala 


He 


Val 


Ser 


He 


Arg 








380 










Asp 


Glu 


He 


Ser 


Gin 


Arg 


Asn 


Arg 






395 










400 


Met 


Val 


His 


Val 


Gin 


Asp 


Phe 


Thr 




410 










415 




Thr 


Pro 


Thr 


Leu 


Gin 


Gly 


Leu 


Ser 


425 










430 






Leu 


Ala 


Val 


Val 


Gly 


Pro 


Val 


Gly 










445 








Ala 


Val 


Leu 


Gly 


Glu 


Leu 


Ala 


Pro 








460 










Gly 


Arg 


He 


Ala 


Tyr 


Val 


Ser 


Gin 






475 










480 


Leu 


Arg 


Ser 


Asn 


He 


Leu 


Phe 


Gly 




490 










495 




Glu 


Lys 


Val 


He 


Lys 


Ala 


Cys 


Ala 


505 










510 






Glu 


Asp 


Gly 


Asp 


Leu 


Thr 


Val 


He 










525 








Gly 


Gly 


Gin 


Lys 


Ala 


Arg 


Val 


Asn 








540 










Ala 


Asp 


He 


Tyr 


Leu 


Leu 


Asp 


Asp 






555 










560 


Val 


Ser 


Arg 


His 


Leu 


Phe 


Glu 


Leu 




570 










575 




Lys 


He 


Thr 


He 


Leu 


Val 


Thr 


His 


585 










590 






Ser 


Gin 


He 


Leu 


He 


Leu 


Lys 


Asp 










605 








Tyr 


Thr 


Glu 


Phe 


Leu 


Lys 


Ser 


Gly 








620 










Lys 


Asp 


Asn 


Glu 


Glu 


Ser 


Glu 


Gin 






635 










640 


Leu 


Arg 


Asn 


Arg 


Thr 


Phe 


Ser 


Glu 




650 










655 




Ser 


Arg 


Pro 


Ser 


Leu 


Lys 


Asp 


Gly 


665 










670 






Asn 


Val 


Pro 


Val 


Thr 


Leu 


Ser 


Glu 










685 








Gly 


Phe 


Gin 


Ala 


Tyr 


Lys 


Asn 


Tyr 








700 










Val 


Phe 


He 


Phe 


Leu 


He 


Leu 


Leu 
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705 








710 










715 










720 


Asn Thr 


Ala 


Ala 


Gin 


Val 


Ala 


Tyr 


Val 


Leu 


Gin 


Asp 


Trp 


Trp 


Leu 


Ser 








725 










730 










735 




Tyr Trp 


Ala 


Asn 


Lys 


Gin 


Ser 


Met 


Leu 


Asn 


Val 


Thr 


Val 


Asn 


Gly 


Gly 






740 










745 










750 






Gly Asn 


Val 


Thr 


Glu 


Lys 


Leu 


Asp 


Leu 


Asn 


Trp 


Tyr 


Leu 


Gly 


He 


Tyr 




755 










760 










765 








Ser Gly 


Leu 


Thr 


Val 


Ala 


Thr 


Val 


Leu 


Phe 


Gly 


He 


Ala 


Arg 


Ser 


Leu 


770 










775 










780 










Leu Val 


Phe 


Tyr 


Val 


Leu 


Val 


Asn 


Ser 


Ser 


Gin 


Thr 


Leu 


His 


Asn 


Lys 


785 








790 










795 










800 


Met Phe 


Glu 


Ser 


He 


Leu 


Lys 


Ala 


Pro 


Val 


Leu 


Phe 


Phe 


Asp 


Arg 


Asn 








805 










810 










815 




Pro lie 


Gly 


Arg 


He 


Leu 


Asn 


Arg 


Phe 


Ser 


Lys 


Asp 


He 


Gly 


His 


Leu 






820 










825 










830 






Asp Asp 


Leu 


Leu 


Pro 


Leu 


Thr 


Phe 


Leu 


Asp 


Phe 


He 


Gin 


Thr 


Leu 


Leu 




835 










840 










845 








Gin Val 


Val 


Gly 


Val 


Val 


Ser 


Val 


Ala 


Val 


Ala 


Val 


He 


Pro 


Trp 


He 


850 










855 










860 










Ala He 


Pro 


Leu 


Val 


Pro 


Leu 


Gly 


He 


He 


Phe 


He 


Phe 


Leu 


Arg 


Arg 


865 








870 










875 










880 


Tyr Phe 


Leu 


Glu 


Thr 


Ser 


Arg 


Asp 


Val 


Lys 


Arg 


Leu 


Glu 


Ser 


Thr 


Thr 








885 










890 










895 




Arg Ser 


Pro 


Val 


Phe 


Ser 


His 


Leu 


Ser 


Ser 


Ser 


Leu 


Gin 


Gly 


Leu 


Trp 






900 










905 










910 






Thr He 


Arg 


Ala 


Tyr 


Lys 


Ala 


Glu 


Glu 


Arg 


Cys 


Gin 


Glu 


Leu 


Phe 


Asp 




915 










920 










925 








Ala His Gln Asp Leu His Ser Glu Ala Trp Phe Leu Phe Leu Thr Thr

930 935 940

Ser Arg Trp Phe Ala Val Arg Leu Asp Ala Ile Cys Ala Met Phe Val
945 950 955 960

Ile Ile Val Ala Phe Gly Ser Leu Ile Leu Ala Lys Thr Leu Asp Ala

965 970 975

Gly Gln Val Gly Leu Ala Leu Ser Tyr Ala Leu Thr Leu Met Gly Met

980 985 990

Phe Gln Trp Cys Val Arg Gln Ser Ala Glu Val Glu Asn Met Met Ile

995 1000 1005

Ser Val Glu Arg Val Ile Glu Tyr Thr Asp Leu Glu Lys Glu Ala Pro

1010 1015 1020

Trp Glu Tyr Gln Lys Arg Pro Pro Pro Ala Trp Pro His Glu Gly Val
1025 1030 1035 1040

Ile Ile Phe Asp Asn Val Asn Phe Met Tyr Ser Pro Gly Gly Pro Leu

1045 1050 1055

Val Leu Lys His Leu Thr Ala Leu Ile Lys Ser Gln Glu Lys Val Gly

1060 1065 1070

Ile Val Gly Arg Thr Gly Ala Gly Lys Ser Ser Leu Ile Ser Ala Leu

1075 1080 1085

Phe Arg Leu Ser Glu Pro Glu Gly Lys Ile Trp Ile Asp Lys Ile Leu

1090 1095 1100

Thr Thr Glu Ile Gly Leu His Asp Leu Arg Lys Lys Met Ser Ile Ile
1105 1110 1115 1120

Pro Gln Glu Pro Val Leu Phe Thr Gly Thr Met Arg Lys Asn Leu Asp

1125 1130 1135

Pro Phe Lys Glu His Thr Asp Glu Glu Leu Trp Asn Ala Leu Arg Glu

1140 1145 1150

Val Gln Leu Lys Glu Thr Ile Glu Asp Leu Pro Gly Lys Met Asp Thr

1155 1160 1165

Glu Leu Ala Glu Ser Gly Ser Asn Phe Ser Val Gly Gln Arg Gln Leu

1170 1175 1180

Val Cys Leu Ala Arg Ala Ile Leu Arg Lys Asn Gln Ile Leu Ile Ile
1185 1190 1195 1200

Asp Glu Ala Thr Ala Asn Val Asp Pro Arg Thr Asp Glu Leu Ile Gln

1205 1210 1215

Lys Lys Ile Arg Glu Lys Phe Ala His Cys Thr Val Leu Thr Ile Ala

1220 1225 1230

His Arg Leu Asn Thr Ile Ile Asp Ser Asp Lys Ile Met Val Leu Asp

1235 1240 1245

Ser Gly Arg Leu Lys Glu Tyr Asp Glu Pro Tyr Val Leu Leu Gln Asn

1250 1255 1260

Lys Glu Ser Leu Phe Tyr Lys Met Val Gln Gln Leu Gly Lys Ala Glu
1265 1270 1275 1280

Ala Ala Ala Leu Thr Glu Thr Ala Lys Gln Val Tyr Phe Lys Arg Asn

1285 1290 1295

Tyr Pro His Ile Gly His Thr Asp His Met Val Thr Asn Thr Ser Asn

1300 1305 1310

Gly Gln Pro Ser Thr Leu Thr Ile Phe Glu Thr Ala Leu
1315 1320 1325 

<210> 3 
<211> 5838 
<212> DNA 

<213> Homo sapiens 
<400> 3 

ccgggcaggt ggctcatgct cgggagcgtg gttgagcggc tggcgcggtt gtcctggagc 60 

aggggcgcag gaattctgat gtgaaactaa cagtctgtga gccctggaac ctccgctcag 120 

agaagatgaa ggatatcgac ataggaaaag agtatatcat ccccagtcct gggtatagaa 180 

gtgtgaggga gagaaccagc acttctggga cgcacagaga ccgtgaagat tccaagttca 240 

ggagaactcg accgttggaa tgccaagatg ccttggaaac agcagcccga gccgagggcc 300

tctctcttga tgcctccatg cattctcagc tcagaatcct ggatgaggag catcccaagg 360

gaaagtacca tcatggcttg agtgctctga agcccatccg gactacttcc aaacaccagc 420 

acccagtgga caatgctggg cttttttcct gtatgacttt ttcgtggctt tcttctctgg 480 

cccgtgtggc ccacaagaag ggggagctct caatggaaga cgtgtggtct ctgtccaagc 540 

acgagtcttc tgacgtgaac tgcagaagac tagagagact gtggcaagaa gagctgaatg 600 

aagttgggcc agacgctgct tccctgcgaa gggttgtgtg gatcttctgc cgcaccaggc 660 

tcatcctgtc catcgtgtgc ctgatgatca cgcagctggc tggcttcagt ggaccagcct 720 

tcatggtgaa acacctcttg gagtataccc aggcaacaga gtctaacctg cagtacagct 780

tgttgttagt gctgggcctc ctcctgacgg aaatcgtgcg gtcttggtcg cttgcactga 840 

cttgggcatt gaattaccga accggtgtcc gcttgcgggg ggccatccta accatggcat 900 

ttaagaagat ccttaagtta aagaacatta aagagaaatc cctgggtgag ctcatcaaca 960

tttgctccaa cgatgggcag agaatgtttg aggcagcagc cgttggcagc ctgctggctg 1020 

gaggacccgt tgttgccatc ttaggcatga tttataatgt aattattctg ggaccaacag 1080 

gcttcctggg atcagctgtt tttatcctct tttacccagc aatgatgttt gcatcacggc 1140 

tcacagcata tttcaggaga aaatgcgtgg ccgccacgga tgaacgtgtc cagaagatga 1200 

atgaagttct tacttacatt aaatttatca aaatgtatgc ctgggtcaaa gcattttctc 1260 

agagtgttca aaaaatccgc gaggaggagc gtcggatatt ggaaaaagcc gggtacttcc 1320 

agggtatcac tgtgggtgtg gctcccattg tggtggtgat tgccagcgtg gtgaccttct 1380 

ctgttcatat gaccctgggc ttcgatctga cagcagcaca ggctttcaca gtggtgacag 1440 

tcttcaattc catgactttt gctttgaaag taacaccgtt ttcagtaaag tccctctcag 1500 

aagcctcagt ggctgttgac agatttaaga gtttgtttct aatggaagag gttcacatga 1560 

taaagaacaa accagccagt cctcacatca agatagagat gaaaaatgcc accttggcat 1620 

gggactcctc ccactccagt atccagaact cgcccaagct gacccccaaa atgaaaaaag 1680 

acaagagggc ttccaggggc aagaaagaga aggtgaggca gctgcagcgc actgagcatc 1740 

aggcggtgct ggcagagcag aaaggccacc tcctcctgga cagtgacgag cggcccagtc 1800 

ccgaagagga agaaggcaag cacatccacc tgggccacct gcgcttacag aggacactgc 1860

acagcatcga tctggagatc caagagggta aactggttgg aatctgcggc agtgtgggaa 1920 

gtggaaaaac ctctctcatt tcagccattt taggccagat gacgcttcta gagggcagca 1980 

ttgcaatcag tggaaccttc gcttatgtgg cccagcaggc ctggatcctc aatgctactc 2040 

tgagagacaa catcctgttt gggaaggaat atgatgaaga aagatacaac tctgtgctga 2100 

acagctgctg cctgaggcct gacctggcca ttcttcccag cagcgacctg acggagattg 2160 

gagagcgagg agccaacctg agcggtgggc agcgccagag gatcagcctt gcccgggcct 2220 

tgtatagtga caggagcatc tacatcctgg acgaccccct cagtgcctta gatgcccatg 2280 

tgggcaacca catcttcaat agtgctatcc ggaaacatct caagtccaag acagttctgt 2340 

ttgttaccca ccagttacag tacctggttg actgtgatga agtgatcttc atgaaagagg 2400 

gctgtattac ggaaagaggc acccatgagg aactgatgaa tttaaatggt gactatgcta 2460 

ccatttttaa taacctgttg ctgggagaga caccgccagt tgagatcaat tcaaaaaagg 2520 

aaaccagtgg ttcacagaag aagtcacaag acaagggtcc taaaacagga tcagtaaaga 2580 

aggaaaaagc agtaaagcca gaggaagggc agcttgtgca gctggaagag aaagggcagg 2640

gttcagtgcc ctggtcagta tatggtgtct acatccaggc tgctgggggc cccttggcat 2700 

tcctggttat tatggccctt ttcatgctga atgtaggcag caccgccttc agcacctggt 2760 

ggttgagtta ctggatcaag caaggaagcg ggaacaccac tgtgactcga gggaacgaga 2820 

cctcggtgag tgacagcatg aaggacaatc ctcatatgca gtactatgcc agcatctacg 2880 
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ccctctccat ggcagtcatg ctgatcctga aagccattcg aggagttgtc tttgtcaagg 2940 

gcacgctgcg agcttcctcc cggctgcatg acgagctttt ccgaaggatc cttcgaagcc 3000 

ctatgaagtt ttttgacacg acccccacag ggaggattct caacaggttt tccaaagaca 3060 

tggatgaagt tgacgtgcgg ctgccgttcc aggccgagat gttcatccag aacgttatcc 3120 

tggtgttctt ctgtgtggga atgatcgcag gagtcttccc gtggttcctt gtggcagtgg 3180 

ggccccttgt catcctcttt tcagtcctgc acattgtctc cagggtcctg attcgggagc 3240 

tgaagcgtct ggacaatatc acgcagtcac ctttcctctc ccacatcacg tccagcatac 3300 

agggccttgc caccatccac gcctacaata aagggcagga gtttctgcac agataccagg 3360

agctgctgga tgacaaccaa gctccttttt ttttgtttac gtgtgcgatg cggtggctgg 3420 

ctgtgcggct ggacctcatc agcatcgccc tcatcaccac cacggggctg atgatcgttc 3480 

ttatgcacgg gcagattccc ccagcctatg cgggtctcgc catctcttat gctgtccagt 3540 

taacggggct gttccagttt acggtcagac tggcatctga gacagaagct cgattcacct 3600

cggtggagag gatcaatcac tacattaaga ctctgtcctt ggaagcacct gccagaatta 3660

agaacaaggc tccctcccct gactggcccc aggagggaga ggtgaccttt gagaacgcag 3720 

agatgaggta ccgagaaaac ctccctcttg tcctaaagaa agtatccttc acgatcaaac 3780 

ctaaagagaa gattggcatc gtggggcgga caggatcagg gaagtcctcg ctggggatgg 3840

ccctcttccg tctggtggag ttatctggag gctgcatcaa gattgatgga gtgagaatca 3900 

gtgatattgg ccttgrcgac ctccgaagca aactctctat cattcctcaa gagccggtgc 3960 

tgttcagtgg cactgtcaga tcaaatttgg accccttcaa ccagtacact gaagaccaga 4020 

tttgggatgc cctggagagg acacacatga aagaatgtat tgctcagcta cctctgaaac 4080 

ttgaatctga agtgatggag aatggggata acttctcagt gggggaacgg cagctcttgt 4140 

gcatagctag agccctgctc cgccactgta agattctgat tttagatgaa gccacagctg 4200 

ccatggacac agagacagac ttattgattc aagagaccat ccgagaagca tttgcagact 4260 

gtaccatgct gaccattgcc catcgcctgc acacggttct aggctccgat aggattatgg 4320 

tgctggccca gggacaggtg gtggagtttg acaccccatc ggtccttctg tccaacgaca 4380 

gttcccgatt ctatgccatg tttgctgctg cagagaacaa ggtcgctgtc aagggctgac 4440 

tcctccctgt tgacgaagtc tcttttcttt agagcattgc cattccctgc ctggggcggg 4500 

cccctcatcg cgtcctccta ccgaaacctt gcctttctcg attttatctt tcgcacagca 4560 

gttccggatt ggcttgtgtg tttcactttt agggagagtc atattttgat tattgtattt 4620 

attccatatt catgtaaaca aaatttagtt tttgttctta attgcactct aaaaggttca 4680 

gggaaccgtt attataattg tatcagaggc ctataatgaa gctttatacg tgtagctata 4740

tctatatata attctgtaca tagcctatat ttacagtgaa aatgtaagct gtttatttta 4800 

tattaaaata agcactgtgc taataacagt gcatattcct ttctatcatt tttgtacagt 4860 

ttgctgtact agagatctgg ttttgctatt agactgtagg aagagtagca tttcattctt 4920 

ctctagctgg tggtttcacg gtgccaggtt ttctgggtgt ccaaaggaag acgtgtggca 4980 

atagtgggcc ctccgacagc cccctctgcc gcctccccac agccgctcca ggggtggctg 5040 

gagacgggtg ggcggctgga gaccatgcag agcgccgtga gttctcaggg ctcctgcctt 5100 

ctgtcctggt gtcacttact gtttctgtca ggagagcagc ggggcgaagc ccaggcccct 5160 

tttcactccc tccatcaaga atggggatca cagagacatt cctccgagcc ggggagtttc 5220 

tttcctgcct tcttcttttt gctgttgttt ctaaacaaga atcagtctat ccacagagag 5280 

tcccactgcc tcaggttcct atggctggcc actgcacaga gctctccagc tccaagacct 5340 

gttggttcca agccctggag ccaactgctg ctttttgagg tggcactttt tcatttgcct 5400 

attcccacac ctccacagtt cagtggcagg gctcaggatt tcgtgggtct gttttccttt 5460 

ctcaccgcag tcgtcgcaca gtctctctct ctctctcccc tcaaagtctg caactttaag 5520 

cagctcttgc taatcagtgt ctcacactgg cgtagaagtt tttgtactgt aaagagacct 5580 

acctcaggtt gctggttgct gtgtggtttg gtgtgttccc gcaaaccccc tttgtgctgt 5640 

gggg^tggta gctcaggtgg gcgtggtcac tgctgtcatc agttgaatgg tcagcgttgc 5700 

atgtcgtgac caactagaca ttctgtcgcc ttagcatgtt tgctgaacac cttgtggaag 5760

caaaaatctg aaaatgtgaa taaaattatt ttggattttg taaaaaaaaa aaaaaaaaaa 5820 

aaaaaaaaaa aaaaaaaa 5838





<210> 4
<211> 1437
<212> PRT

<213> Homo sapiens
<400> 4

Met Lys Asp Ile Asp Ile Gly Lys Glu Tyr Ile Ile Pro Ser Pro Gly
1 5 10 15

Tyr Arg Ser Val Arg Glu Arg Thr Ser Thr Ser Gly Thr His Arg Asp

20 25 30

Arg Glu Asp Ser Lys Phe Arg Arg Thr Arg Pro Leu Glu Cys Gln Asp

35 40 45

Ala Leu Glu Thr Ala Ala Arg Ala Glu Gly Leu Ser Leu Asp Ala Ser

50 55 60

Met His Ser Gln Leu Arg Ile Leu Asp Glu Glu His Pro Lys Gly Lys
65 70 75 80

Tyr His His Gly Leu Ser Ala Leu Lys Pro Ile Arg Thr Thr Ser Lys

85 90 95

His Gln His Pro Val Asp Asn Ala Gly Leu Phe Ser Cys Met Thr Phe

100 105 110

Ser Trp Leu Ser Ser Leu Ala Arg Val Ala His Lys Lys Gly Glu Leu

115 120 125

Ser Met Glu Asp Val Trp Ser Leu Ser Lys His Glu Ser Ser Asp Val

130 135 140

Asn Cys Arg Arg Leu Glu Arg Leu Trp Gln Glu Glu Leu Asn Glu Val
145 150 155 160

Gly Pro Asp Ala Ala Ser Leu Arg Arg Val Val Trp Ile Phe Cys Arg

165 170 175

Thr Arg Leu Ile Leu Ser Ile Val Cys Leu Met Ile Thr Gln Leu Ala

180 185 190

Gly Phe Ser Gly Pro Ala Phe Met Val Lys His Leu Leu Glu Tyr Thr

195 200 205

Gln Ala Thr Glu Ser Asn Leu Gln Tyr Ser Leu Leu Leu Val Leu Gly

210 215 220

Leu Leu Leu Thr Glu Ile Val Arg Ser Trp Ser Leu Ala Leu Thr Trp
225 230 235 240

Ala Leu Asn Tyr Arg Thr Gly Val Arg Leu Arg Gly Ala Ile Leu Thr

245 250 255

Met Ala Phe Lys Lys Ile Leu Lys Leu Lys Asn Ile Lys Glu Lys Ser

260 265 270

Leu Gly Glu Leu Ile Asn Ile Cys Ser Asn Asp Gly Gln Arg Met Phe

275 280 285

Glu Ala Ala Ala Val Gly Ser Leu Leu Ala Gly Gly Pro Val Val Ala

290 295 300

Ile Leu Gly Met Ile Tyr Asn Val Ile Ile Leu Gly Pro Thr Gly Phe
305 310 315 320

Leu Gly Ser Ala Val Phe Ile Leu Phe Tyr Pro Ala Met Met Phe Ala

325 330 335

Ser Arg Leu Thr Ala Tyr Phe Arg Arg Lys Cys Val Ala Ala Thr Asp

340 345 350

Glu Arg Val Gln Lys Met Asn Glu Val Leu Thr Tyr Ile Lys Phe Ile

355 360 365

Lys Met Tyr Ala Trp Val Lys Ala Phe Ser Gln Ser Val Gln Lys Ile

370 375 380

Arg Glu Glu Glu Arg Arg Ile Leu Glu Lys Ala Gly Tyr Phe Gln Gly
385 390 395 400

Ile Thr Val Gly Val Ala Pro Ile Val Val Val Ile Ala Ser Val Val

405 410 415

Thr Phe Ser Val His Met Thr Leu Gly Phe Asp Leu Thr Ala Ala Gln

420 425 430

Ala Phe Thr Val Val Thr Val Phe Asn Ser Met Thr Phe Ala Leu Lys

435 440 445

Val Thr Pro Phe Ser Val Lys Ser Leu Ser Glu Ala Ser Val Ala Val

450 455 460

Asp Arg Phe Lys Ser Leu Phe Leu Met Glu Glu Val His Met Ile Lys
465 470 475 480

Asn Lys Pro Ala Ser Pro His Ile Lys Ile Glu Met Lys Asn Ala Thr

485 490 495

Leu Ala Trp Asp Ser Ser His Ser Ser Ile Gln Asn Ser Pro Lys Leu

500 505 510

Thr Pro Lys Met Lys Lys Asp Lys Arg Ala Ser Arg Gly Lys Lys Glu

515 520 525

Lys Val Arg Gln Leu Gln Arg Thr Glu His Gln Ala Val Leu Ala Glu

530 535 540

Gln Lys Gly His Leu Leu Leu Asp Ser Asp Glu Arg Pro Ser Pro Glu
545 550 555 560

Glu Glu Glu Gly Lys His Ile His Leu Gly His Leu Arg Leu Gln Arg

565 570 575

Thr Leu His Ser Ile Asp Leu Glu Ile Gln Glu Gly Lys Leu Val Gly

580 585 590

Ile Cys Gly Ser Val Gly Ser Gly Lys Thr Ser Leu Ile Ser Ala Ile

595 600 605

Leu Gly Gln Met Thr Leu Leu Glu Gly Ser Ile Ala Ile Ser Gly Thr

610 615 620

Phe Ala Tyr Val Ala Gln Gln Ala Trp Ile Leu Asn Ala Thr Leu Arg
625 630 635 640

Asp Asn Ile Leu Phe Gly Lys Glu Tyr Asp Glu Glu Arg Tyr Asn Ser

645 650 655

Val Leu Asn Ser Cys Cys Leu Arg Pro Asp Leu Ala Ile Leu Pro Ser

660 665 670

Ser Asp Leu Thr Glu Ile Gly Glu Arg Gly Ala Asn Leu Ser Gly Gly

675 680 685

Gln Arg Gln Arg Ile Ser Leu Ala Arg Ala Leu Tyr Ser Asp Arg Ser

690 695 700

Ile Tyr Ile Leu Asp Asp Pro Leu Ser Ala Leu Asp Ala His Val Gly
705 710 715 720

Asn His Ile Phe Asn Ser Ala Ile Arg Lys His Leu Lys Ser Lys Thr

725 730 735

Val Leu Phe Val Thr His Gln Leu Gln Tyr Leu Val Asp Cys Asp Glu

740 745 750

Val Ile Phe Met Lys Glu Gly Cys Ile Thr Glu Arg Gly Thr His Glu

755 760 765

Glu Leu Met Asn Leu Asn Gly Asp Tyr Ala Thr Ile Phe Asn Asn Leu

770 775 780

Leu Leu Gly Glu Thr Pro Pro Val Glu Ile Asn Ser Lys Lys Glu Thr
785 790 795 800

Ser Gly Ser Gln Lys Lys Ser Gln Asp Lys Gly Pro Lys Thr Gly Ser

805 810 815

Val Lys Lys Glu Lys Ala Val Lys Pro Glu Glu Gly Gln Leu Val Gln

820 825 830

Leu Glu Glu Lys Gly Gln Gly Ser Val Pro Trp Ser Val Tyr Gly Val

835 840 845

Tyr Ile Gln Ala Ala Gly Gly Pro Leu Ala Phe Leu Val Ile Met Ala

850 855 860

Leu Phe Met Leu Asn Val Gly Ser Thr Ala Phe Ser Thr Trp Trp Leu
865 870 875 880

Ser Tyr Trp Ile Lys Gln Gly Ser Gly Asn Thr Thr Val Thr Arg Gly

885 890 895

Asn Glu Thr Ser Val Ser Asp Ser Met Lys Asp Asn Pro His Met Gln

900 905 910

Tyr Tyr Ala Ser Ile Tyr Ala Leu Ser Met Ala Val Met Leu Ile Leu

915 920 925

Lys Ala Ile Arg Gly Val Val Phe Val Lys Gly Thr Leu Arg Ala Ser

930 935 940

Ser Arg Leu His Asp Glu Leu Phe Arg Arg Ile Leu Arg Ser Pro Met
945 950 955 960

Lys Phe Phe Asp Thr Thr Pro Thr Gly Arg Ile Leu Asn Arg Phe Ser

965 970 975

Lys Asp Met Asp Glu Val Asp Val Arg Leu Pro Phe Gln Ala Glu Met

980 985 990

Phe Ile Gln Asn Val Ile Leu Val Phe Phe Cys Val Gly Met Ile Ala

995 1000 1005

Gly Val Phe Pro Trp Phe Leu Val Ala Val Gly Pro Leu Val Ile Leu

1010 1015 1020

Phe Ser Val Leu His Ile Val Ser Arg Val Leu Ile Arg Glu Leu Lys
1025 1030 1035 1040

Arg Leu Asp Asn Ile Thr Gln Ser Pro Phe Leu Ser His Ile Thr Ser

1045 1050 1055

Ser Ile Gln Gly Leu Ala Thr Ile His Ala Tyr Asn Lys Gly Gln Glu

1060 1065 1070

Phe Leu His Arg Tyr Gln Glu Leu Leu Asp Asp Asn Gln Ala Pro Phe

1075 1080 1085

Phe Leu Phe Thr Cys Ala Met Arg Trp Leu Ala Val Arg Leu Asp Leu

1090 1095 1100

Ile Ser Ile Ala Leu Ile Thr Thr Thr Gly Leu Met Ile Val Leu Met
1105 1110 1115 1120

His Gly Gln Ile Pro Pro Ala Tyr Ala Gly Leu Ala Ile Ser Tyr Ala

1125 1130 1135

Val Gln Leu Thr Gly Leu Phe Gln Phe Thr Val Arg Leu Ala Ser Glu

1140 1145 1150

Thr Glu Ala Arg Phe Thr Ser Val Glu Arg Ile Asn His Tyr Ile Lys

1155 1160 1165

Thr Leu Ser Leu Glu Ala Pro Ala Arg Ile Lys Asn Lys Ala Pro Ser

1170 1175 1180

Pro Asp Trp Pro Gln Glu Gly Glu Val Thr Phe Glu Asn Ala Glu Met
1185 1190 1195 1200

Arg Tyr Arg Glu Asn Leu Pro Leu Val Leu Lys Lys Val Ser Phe Thr

1205 1210 1215

Ile Lys Pro Lys Glu Lys Ile Gly Ile Val Gly Arg Thr Gly Ser Gly

1220 1225 1230

Lys Ser Ser Leu Gly Met Ala Leu Phe Arg Leu Val Glu Leu Ser Gly

1235 1240 1245

Gly Cys Ile Lys Ile Asp Gly Val Arg Ile Ser Asp Ile Gly Leu Aid

1250 1255 1260

Asp Leu Arg Ser Lys Leu Ser Ile Ile Pro Gln Glu Pro Val Leu Phe
1265 1270 1275 1280

Ser Gly Thr Val Arg Ser Asn Leu Asp Pro Phe Asn Gln Tyr Thr Glu

1285 1290 1295

Asp Gln Ile Trp Asp Ala Leu Glu Arg Thr His Met Lys Glu Cys Ile

1300 1305 1310

Ala Gln Leu Pro Leu Lys Leu Glu Ser Glu Val Met Glu Asn Gly Asp

1315 1320 1325

Asn Phe Ser Val Gly Glu Arg Gln Leu Leu Cys Ile Ala Arg Ala Leu

1330 1335 1340

Leu Arg His Cys Lys Ile Leu Ile Leu Asp Glu Ala Thr Ala Ala Met
1345 1350 1355 1360

Asp Thr Glu Thr Asp Leu Leu Ile Gln Glu Thr Ile Arg Glu Ala Phe

1365 1370 1375

Ala Asp Cys Thr Met Leu Thr Ile Ala His Arg Leu His Thr Val Leu

1380 1385 1390

Gly Ser Asp Arg Ile Met Val Leu Ala Gln Gly Gln Val Val Gln Phe

1395 1400 1405

Asp Thr Pro Ser Val Leu Leu Ser Asn Asp Ser Ser Arg Phe Tyr Ala

1410 1415 1420

Met Phe Ala Ala Ala Glu Asn Lys Val Ala Val Lys Gly
1425 1430 1435



<210> 5 
<211> 5079 
<212> DNA 

<213> Homo sapiens 



<400> 5 

ccccatggac gccctgtgcg gttccgggga gctcggctcc aagttctggg actccaacct 60 

gtctgtgcac acagaaaacc cggacctcac tccctgcttc cagaactccc tgctggcctg 120 

ggtgccctgc atctacctgt gggtcgccct gccctgctac ttgctctacc tgcggcacca 180 

ttgtcgtggc tacatcatcc tctcccacct gtccaagctc aagatggtcc tgggtgtcct 240 

gctgtggtgc gtctcctggg cggacctttt ttactccttc catggcctgg tccatggccg 300 

ggcccctgcc cctgttttct ttgtcacccc cttggtggtg ggggtcacca tgctgctggc 360 

caccctgctg atacagtatg agcggctgca gggcgtacag tcttcggggg tcctcattat 420 

cttctggttc ctgtgtgtgg tctgcgccat cgtcccattc cgctccaaga tccttttagc 480 

caaggcagag ggtgagatct cagacccctt ccgcttcacc accttctaca tccactttgc 540 

cctggtactc tctgccctca tcttggcctg cttcagggag aaacctccat ttttctccgc 600 

aaagaatgtc gaccctaacc cctaccctga gaccagcgct ggctttctct cccgcctgtt 660 

tttctggtgg ttcacaaaga tggccatcta tggctaccgg catcccctgg aggagaagga 720 

cctctggtcc ctaaaggaag aggacagatc ccagatggtg gtgcagcagc tgctggaggc 780 

atggaggaag caggaaaagc agacggcacg acacaaggct tcagcagcac ctgggaaaaa 840

tgcctccggc gaggacgagg tgctgctggg tgcccggccc aggccccgga agccctcctt 900 

cctgaaggcc ctgctggcca ccttcggctc cagcttcctc atcagtgcct gcttcaagct 960 

tatccaggac ctgctctcct tcatcaatcc acagctgctc agcatcctga tcaggtttat 1020 

ctccaacccc atggccccct cctggtgggg cttcctggtg gctgggctga tgttcctgtg 1080 
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ctccatgatg cagtcgctga tcttacaaca ctattaccac tacatctttg tgaccggggt 1140 

gaagtttcgt actgggatca tgggtgtcat ctacaggaag gctctggtta tcaccaactc 1200 

agtcaaacgt gcgtccactg tgggggaaat tgtcaacctc atgtcagtgg atgcccagcg 1260 

cttcatggac cttgccccct tcctcaatct gctgtggtca gcacccctgc agatcatcct 1320 

ggcgatctac ttcctctggc agaacctagg tccctctgtc ctggctggag tcgctttcat 1380 

ggtcttgctg attccactca acggagctgt ggccgtgaag atgcgcgcct tccaggtaaa 1440 

gcaaatgaaa ttgaaggact cgcgcatcaa gctgatgagt gagatcctga acggcatcaa 1500 

ggtgctgaag ctgtacgcct gggagcccag cttcctgaag caggtggagg gcatcaggca 1560 

gggtgagctc cagctgctgc gcacggcggc ctacctccac accacaacca ccttcacctg 1620 

gatgtgcagc cccttcctgg tgaccctgat caccctctgg gtgtacgtgt acgtggaccc 1680 

aaacaatgtg ctggacgccg agaaggcctt tgtgtctgtg tccttgttta atatcttaag 1740 

acttcccctc aacatgctgc cccagttaat cagcaacctg actcaggcca gtgtgtctct 1800 

gaaacggatc cagcaattcc tgagccaaga ggaacttgac ccccagagtg tggaaagaaa 1860 

gaccatctcc ccaggctatg ccatcaccat acacagtggc accttcacct gggcccagga 1920 

cctgcccccc actctgcaca gcctagacat ccaggtcccg aaaggggcac tggtggccgt 1980 

ggtggggcct gtgggctgtg ggaagtcctc cctggtgtct gccctgctgg gagagatgga 2040 

gaagctagaa ggcaaagtgc acatgaaggg ctccgtggcc tatgtgcccc agcaggcatg 2100 

gatccagaac tgcactcttc aggaaaacgt gcttttcggc aaagccctga accccaagcg 2160 

ctaccagcag actctggagg cctgtgcctt gctagctgac ctggagatgc tgcctggtgg 2220 

ggatcagaca gagattggag agaagggcat taacctgtct gggggccagc ggcagcgggt 2280 

cagtctggct cgagctgttt acagtgatgc cgatattttc ttgctggatg acccactgtc 2340 

cgcggtggac tctcatgtgg ccaagcacat ctttgaccac gtcatcgggc cagaaggcgt 2400 

gctggcaggc aagacgcgag tgctggtgac gcacggcatt agcttcctgc cccagacaga 2460 

cttcatcatt gtgctagctg atggacaggt gtctgagatg ggcccgtacc cagccctgct 2520 

gcagcgcaac ggctcctttg ccaactttct ctgcaactat gcccccgatg aggaccaagg 2580 

gcacctggag gacagctgga ccgcgttgga aggtgcagag gataaggagg cactgctgat 2640

tgaagacaca ctcagcaacc acacggatct gacagacaat gatccagtca cctatgtggt 2700 

ccagaagcag tttatgagac agctgagtgc cctgtcctca gatggggagg gacagggtcg 2760 

gcctgtaccc cggaggcacc tgggtccatc agagaaggtg caggtgacag aggcgaaggc 2820

agatggggca ctgacccagg aggagaaagc agccattggc actgtggagc tcagtgtgtt 2880

ctgggattat gccaaggccg tggggctctg taccacgctg gccatctgtc tcctgtatgt 2940 

gggtcaaagt gcggctgcca ttggagccaa tgtgtggctc agtgcctgga caaatgatgc 3000

catggcagac agtagacaga acaacacttc cctgaggctg ggcgtctatg ctgctttagg 3060 

aattctgcaa gggttcttgg tgatgctggc agccatggcc atggcagcgg gtggcatcca 3120 

ggctgcccgt gtgttgcacc aggcactgct gcacaacaag atacgctcgc cacagtcctt 3180 

ctttgacacc acaccatcag gccgcatcct gaactgcttc tccaaggaca tctatgtcgt 3240 

tgatgaggtt ctggcccctg tcatcctcat gctgctcaat tccttcttca acgccatctc 3300 

cactcttgtg gtcatcatgg ccagcacgcc gctcttcact gtggtcatcc tgcccctggc 3360 

tgtgctctac accttagtgc agcgcttcta tgcagccaca tcacggcaac tgaagcggct 3420 

ggaatcagtc agccgctcac ctatctactc ccacttttcg gagacagtga ctggtgccag 3480 

tgtcatccgg gcctacaacc gcagccggga ttttgagatc atcagtgata ctaaggtgga 3540

tgccaaccag agaagctgct acccctacat catctccaac cggtggctga gcatcggagt 3600 

ggagttcgtg gggaactgcg tggtgctctt tgctgcacta tttgccgtca tcgggaggag 3660

cagcctgaac ccggggctgg tgggcctttc tgtgtcctac tccttgcagg tgacatttgc 3720 

tctgaactgg atgatacgaa tgatgtcaga tttggaatct aacatcgtgg ctgtggagag 3780 

ggtcaaggag tactccaaga cagagacaga ggcgccctgg gtggtggaag gcagccgccc 3840

tcccgaaggt tggcccccac gtggggaggt ggagttccgg aattattctg tgcgctaccg 3900 

gccgggccta gacctggtgc tgagagacct gagtctgcat gtgcacggtg gcgagaaggt 3960

ggggatcgtg ggccgcactg gggctggcaa gtcttccatg accctttgcc tgttccgcat 4020 

cctggaggcg gcaaagggtg aaatccgcat tgatggcctc aatgtggcag acatcggcct 4080 

ccatgacctg cgctctcagc tgaccatcat cccgcaggac cccatcctgt tctcggggac 4140 

cctgcgcatg aacctggacc ccttcggcag ctactcagag gaggacattt ggtgggcttt 4200 

ggagctgtcc cacctgcaca cgtttgtgag ctcccagccg gcaggcctgg acttccagtg 4260 

ctcagagggc ggggagaatc tcagcgtggg ccagaggcag ctcgtgtgcc tggcccgagc 4320 

cctgctccgc aagagccgca tcctggtttt agacgaggcc acagctgcca tcgacctgga 4380 

gactgacaac ctcatccagg ctaccatccg cacccagttt gatacctgca ctgtcctgac 4440 

catcgcacac cggcttaaca ctatcatgga ctacaccagg gtcctggtcc tggacaaagg 4500 

agtagtagct gaatttgatt ctccagccaa cctcattgca gctagaggca tcttctacgg 4560 

gatggccaga gatgctggac ttgcctaaaa tatattcctg agatttcctc ctggcctttc 4620

ctggttttca tcaggaagga aatgacacca aatatgtccg cagaatggac ttgatagcaa 4680

acactggggg caccttaaga ttttgcacct gtaaagtgcc ttacagggta actgtgctga 4740 

atgctttaga tgaggaaatg atccccaagt ggtgaatgac acgcctaagg tcacagctag 4800 

tttgagccag ttagactagt ccccggtctc ccgattccca actgagtgtt atttgcacac 4860 

tgcactgttt tcaaataacg attttatgaa atgacctctg tcctccctct gatttttcat 4920 

attttctaaa gtttcgtttc tgttttttaa taaaaagctt tttcctcctg gaacagaaga 4980

cagctgctgg gtcaggccac ccctaggaac tcagtcctgt actctggggt gctgcctgaa 5040
tccattaaaa atgggagtac tgatgaaata aaactacag 5079



<210> 6

<211> 1527

<212> PRT

<213> Homo sapiens

<400> 6

Met Asp Ala Leu Cys Gly Ser Gly Glu Leu Gly Ser Lys Phe Trp Asp
1 5 10 15

Ser Asn Leu Ser Val His Thr Glu Asn Pro Asp Leu Thr Pro Cys Phe

20 25 30

Gln Asn Ser Leu Leu Ala Trp Val Pro Cys Ile Tyr Leu Trp Val Ala

35 40 45

Leu Pro Cys Tyr Leu Leu Tyr Leu Arg His His Cys Arg Gly Tyr Ile

50 55 60

Ile Leu Ser His Leu Ser Lys Leu Lys Met Val Leu Gly Val Leu Leu
65 70 75 80

Trp Cys Val Ser Trp Ala Asp Leu Phe Tyr Ser Phe His Gly Leu Val

85 90 95

His Gly Arg Ala Pro Ala Pro Val Phe Phe Val Thr Pro Leu Val Val

100 105 110

Gly Val Thr Met Leu Leu Ala Thr Leu Leu Ile Gln Tyr Glu Arg Leu

115 120 125

Gln Gly Val Gln Ser Ser Gly Val Leu Ile Ile Phe Trp Phe Leu Cys

130 135 140

Val Val Cys Ala Ile Val Pro Phe Arg Ser Lys Ile Leu Leu Ala Lys
145 150 155 160

Ala Glu Gly Glu Ile Ser Asp Pro Phe Arg Phe Thr Thr Phe Tyr Ile

165 170 175

His Phe Ala Leu Val Leu Ser Ala Leu Ile Leu Ala Cys Phe Arg Glu

180 185 190

Lys Pro Pro Phe Phe Ser Ala Lys Asn Val Asp Pro Asn Pro Tyr Pro

195 200 205

Glu Thr Ser Val Gly Phe Leu Ser Arg Leu Phe Phe Trp Trp Phe Thr

210 215 220

Lys Met Ala Ile Tyr Gly Tyr Arg His Pro Leu Glu Glu Lys Asp Leu
225 230 235 240

Trp Ser Leu Lys Glu Glu Asp Arg Ser Gln Met Val Val Gln Gln Leu

245 250 255

Leu Glu Ala Trp Arg Lys Gln Glu Lys Gln Thr Ala Arg His Lys Ala

260 265 270

Ser Ala Ala Pro Gly Lys Asn Ala Ser Gly Glu Asp Glu Val Leu Leu

275 280 285

Gly Ala Arg Pro Arg Pro Arg Lys Pro Ser Phe Leu Lys Ala Leu Leu

290 295 300

Ala Thr Phe Gly Ser Ser Phe Leu Ile Ser Ala Cys Phe Lys Leu Ile
305 310 315 320

Gln Asp Leu Leu Ser Phe Ile Asn Pro Gln Leu Leu Ser Ile Leu Ile

325 330 335

Arg Phe Ile Ser Asn Pro Met Ala Pro Ser Trp Trp Gly Phe Leu Val

340 345 350

Ala Gly Leu Met Phe Leu Cys Ser Met Met Gln Ser Leu Ile Leu Gln

355 360 365

His Tyr Tyr His Tyr Ile Phe Val Thr Gly Val Lys Phe Arg Thr Gly

370 375 380

Ile Met Gly Val Ile Tyr Arg Lys Ala Leu Val Ile Thr Asn Ser Val
385 390 395 400

Lys Arg Ala Ser Thr Val Gly Glu Ile Val Asn Leu Met Ser Val Asp

405 410 415

Ala Gln Arg Phe Met Asp Leu Ala Pro Phe Leu Asn Leu Leu Trp Ser

420 425 430

Ala Pro Leu Gln Ile Ile Leu Ala Ile Tyr Phe Leu Trp Gln Asn Leu

435 440 445

Gly Pro Ser Val Leu Ala Gly Val Ala Phe Met Val Leu Leu Ile Pro

450 455 460

Leu Asn Gly Ala Val Ala Val Lys Met Arg Ala Phe Gln Val Lys Gln
465 470 475 480

Met Lys Leu Lys Asp Ser Arg Ile Lys Leu Met Ser Glu Ile Leu Asn

485 490 495

Gly Ile Lys Val Leu Lys Leu Tyr Ala Trp Glu Pro Ser Phe Leu Lys

500 505 510

Gln Val Glu Gly Ile Arg Gln Gly Glu Leu Gln Leu Leu Arg Thr Ala

515 520 525

Ala Tyr Leu His Thr Thr Thr Thr Phe Thr Trp Met Cys Ser Pro Phe

530 535 540

Leu Val Thr Leu Ile Thr Leu Trp Val Tyr Val Tyr Val Asp Pro Asn
545 550 555 560

Asn Val Leu Asp Ala Glu Lys Ala Phe Val Ser Val Ser Leu Phe Asn

565 570 575

Ile Leu Arg Leu Pro Leu Asn Met Leu Pro Gln Leu Ile Ser Asn Leu

580 585 590

Thr Gln Ala Ser Val Ser Leu Lys Arg Ile Gln Gln Phe Leu Ser Gln

595 600 605

Glu Glu Leu Asp Pro Gln Ser Val Glu Arg Lys Thr Ile Ser Pro Gly

610 615 620

Tyr Ala Ile Thr Ile His Ser Gly Thr Phe Thr Trp Ala Gln Asp Leu
625 630 635 640

Pro Pro Thr Leu His Ser Leu Asp Ile Gln Val Pro Lys Gly Ala Leu

645 650 655

Val Ala Val Val Gly Pro Val Gly Cys Gly Lys Ser Ser Leu Val Ser

660 665 670

Ala Leu Leu Gly Glu Met Glu Lys Leu Glu Gly Lys Val His Met Lys

675 680 685

Gly Ser Val Ala Tyr Val Pro Gln Gln Ala Trp Ile Gln Asn Cys Thr

690 695 700

Leu Gln Glu Asn Val Leu Phe Gly Lys Ala Leu Asn Pro Lys Arg Tyr
705 710 715 720

Gln Gln Thr Leu Glu Ala Cys Ala Leu Leu Ala Asp Leu Glu Met Leu

725 730 735

Pro Gly Gly Asp Gln Thr Glu Ile Gly Glu Lys Gly Ile Asn Leu Ser

740 745 750

Gly Gly Gln Arg Gln Arg Val Ser Leu Ala Arg Ala Val Tyr Ser Asp

755 760 765

Ala Asp Ile Phe Leu Leu Asp Asp Pro Leu Ser Ala Val Asp Ser His

770 775 780

Val Ala Lys His Ile Phe Asp His Val Ile Gly Pro Glu Gly Val Leu
785 790 795 800

Ala Gly Lys Thr Arg Val Leu Val Thr His Gly Ile Ser Phe Leu Pro

805 810 815

Gln Thr Asp Phe Ile Ile Val Leu Ala Asp Gly Gln Val Ser Glu Met

820 825 830

Gly Pro Tyr Pro Ala Leu Leu Gln Arg Asn Gly Ser Phe Ala Asn Phe

835 840 845

Leu Cys Asn Tyr Ala Pro Asp Glu Asp Gln Gly His Leu Glu Asp Ser

850 855 860

Trp Thr Ala Leu Glu Gly Ala Glu Asp Lys Glu Ala Leu Leu Ile Glu
865 870 875 880

Asp Thr Leu Ser Asn His Thr Asp Leu Thr Asp Asn Asp Pro Val Thr

885 890 895

Tyr Val Val Gln Lys Gln Phe Met Arg Gln Leu Ser Ala Leu Ser Ser

900 905 910

Asp Gly Glu Gly Gln Gly Arg Pro Val Pro Arg Arg His Leu Gly Pro

915 920 925

Ser Glu Lys Val Gln Val Thr Glu Ala Lys Ala Asp Gly Ala Leu Thr

930 935 940

Gln Glu Glu Lys Ala Ala Ile Gly Thr Val Glu Leu Ser Val Phe Trp
945 950 955 960

Asp Tyr Ala Lys Ala Val Gly Leu Cys Thr Thr Leu Ala Ile Cys Leu

965 970 975

Leu Tyr Val Gly Gln Ser Ala Ala Ala Ile Gly Ala Asn Val Trp Leu

980 985 990

Ser Ala Trp Thr Asn Asp Ala Met Ala Asp Ser Arg Gln Asn Asn Thr

995 1000 1005

Ser Leu Arg Leu Gly Val Tyr Ala Ala Leu Gly Ile Leu Gln Gly Phe

1010 1015 1020

Leu Val Met Leu Ala Ala Met Ala Met Ala Ala Gly Gly Ile Gln Ala
1025 1030 1035 1040

Ala Arg Val Leu His Gln Ala Leu Leu His Asn Lys Ile Arg Ser Pro

1045 1050 1055

Gln Ser Phe Phe Asp Thr Thr Pro Ser Gly Arg Ile Leu Asn Cys Phe

1060 1065 1070

Ser Lys Asp Ile Tyr Val Val Asp Glu Val Leu Ala Pro Val Ile Leu

1075 1080 1085

Met Leu Leu Asn Ser Phe Phe Asn Ala Ile Ser Thr Leu Val Val Ile

1090 1095 1100

Met Ala Ser Thr Pro Leu Phe Thr Val Val Ile Leu Pro Leu Ala Val
1105 1110 1115 1120

Leu Tyr Thr Leu Val Gln Arg Phe Tyr Ala Ala Thr Ser Arg Gln Leu

1125 1130 1135

Lys Arg Leu Glu Ser Val Ser Arg Ser Pro Ile Tyr Ser His Phe Ser

1140 1145 1150

Glu Thr Val Thr Gly Ala Ser Val Ile Arg Ala Tyr Asn Arg Ser Arg

1155 1160 1165

Asp Phe Glu Ile Ile Ser Asp Thr Lys Val Asp Ala Asn Gln Arg Ser

1170 1175 1180

Cys Tyr Pro Tyr Ile Ile Ser Asn Arg Trp Leu Ser Ile Gly Val Glu
1185 1190 1195 1200

Phe Val Gly Asn Cys Val Val Leu Phe Ala Ala Leu Phe Ala Val Ile

1205 1210 1215

Gly Arg Ser Ser Leu Asn Pro Gly Leu Val Gly Leu Ser Val Ser Tyr

1220 1225 1230

Ser Leu Gln Val Thr Phe Ala Leu Asn Trp Met Ile Arg Met Met Ser

1235 1240 1245

Asp Leu Glu Ser Asn Ile Val Ala Val Glu Arg Val Lys Glu Tyr Ser

1250 1255 1260

Lys Thr Glu Thr Glu Ala Pro Trp Val Val Glu Gly Ser Arg Pro Pro
1265 1270 1275 1280

Glu Gly Trp Pro Pro Arg Gly Glu Val Glu Phe Arg Asn Tyr Ser Val

1285 1290 1295

Arg Tyr Arg Pro Gly Leu Asp Leu Val Leu Arg Asp Leu Ser Leu His

1300 1305 1310

Val His Gly Gly Glu Lys Val Gly Ile Val Gly Arg Thr Gly Ala Gly

1315 1320 1325

Lys Ser Ser Met Thr Leu Cys Leu Phe Arg Ile Leu Glu Ala Ala Lys

1330 1335 1340

Gly Glu Ile Arg Ile Asp Gly Leu Asn Val Ala Asp Ile Gly Leu His
1345 1350 1355 1360

Asp Leu Arg Ser Gln Leu Thr Ile Ile Pro Gln Asp Pro Ile Leu Phe

1365 1370 1375

Ser Gly Thr Leu Arg Met Asn Leu Asp Pro Phe Gly Ser Tyr Ser Glu

1380 1385 1390

Glu Asp Ile Trp Trp Ala Leu Glu Leu Ser His Leu His Thr Phe Val

1395 1400 1405

Ser Ser Gln Pro Ala Gly Leu Asp Phe Gln Cys Ser Glu Gly Gly Glu

1410 1415 1420

Asn Leu Ser Val Gly Gln Arg Gln Leu Val Cys Leu Ala Arg Ala Leu
1425 1430 1435 1440

Leu Arg Lys Ser Arg Ile Leu Val Leu Asp Glu Ala Thr Ala Ala Ile

1445 1450 1455

Asp Leu Glu Thr Asp Asn Leu Ile Gln Ala Thr Ile Arg Thr Gln Phe

1460 1465 1470

Asp Thr Cys Thr Val Leu Thr Ile Ala His Arg Leu Asn Thr Ile Met

1475 1480 1485

Asp Tyr Thr Arg Val Leu Val Leu Asp Lys Gly Val Val Ala Glu Phe
1490 1495 1500 



Asp Ser Pro Ala Asn Leu Ile Ala Ala Arg Gly Ile Phe Tyr Gly Met
1505 1510 1515 1520 

Ala Arg Asp Ala Gly Leu Ala 
1525 

<210> 7 

<211> 4509 

<212> DNA 

<213> Homo sapiens 



<400> 7 

atggccgcgc ctgctgagcc ctgcgcgggg cagggggtct ggaaccagac agagcctgaa 60 

cctgccgcca ccagcctgct gagcctgtgc ttcctgagaa cagcaggggt ctgggtaccc 120 

cccatgtacc tctgggtcct tggtcccatc tacctcctct tcatccacca ccatggccgg 180 

ggctacctcc ggatgtcccc actcttcaaa gccaagatgg tgcttggatt cgccctcata 240 

gtcctgtgta cctccagcgt ggctgtcgct ctttggaaaa tccaacaggg aacgcctgag 300 

gccccagaat tcctcattca tcctactgtg tggctcacca cgatgagctt cgcagtgttc 360 

ctgattcaca ccgagaggaa aaagggagtc cagtcatctg gagtgctgtt tggttactgg 420 

cttctctgct ttgtcttgcc agctaccaac gctgcccagc aggcctccgg agcgggcttc 480 

cagagcgacc ctgtccgcca cctgtccacc tacctatgcc tgtctctggt ggtggcacag 540 

tttgtgctgt cctgcctggc ggatcaaccc cccttcttcc ctgaagaccc ccagcagtct 600 

aacccctgtc cagagactgg ggcagccttc ccctccaaag ccacgttctg gtgggtttct 660 

ggcctggtct ggaggggata caggaggcca ctgagaccaa aagacctctg gtcgcttggg 720 

agagaaaact cctcagaaga acttgtttcc cggcttgaaa aggagtggat gaggaaccgc 780 

agtgcagccc ggaggcacaa caaggcaata gcatttaaaa ggaaaggcgg cagtggcatg 840 

aaggctccag agaccgagcc cttcctacgg caagaaggga gccagtggcg cccactgctg 900 

aaggccatct ggcaggtgtt ccattctacc ttcctcctgg ggaccctcag cctcatcatc 960 

agtgatgtct tcaggttcac tgtccccaag ctgctcagcc ttttcctgga gtttattggt 1020

gatcccaagc ctccagcctg gaagggctac ctcctcgccg tgctgatgtt cctctcagcc 1080 

tgcctgcaaa cgctgtttga gcagcagaac atgtacaggc tcaaggtgcc gcagatgagg 1140 

ttgcggtcgg ccatcactgg cctggtgtac agaaaggtcc tggctctgtc cagcggctcc 1200 

agaaaggcca gtgcggtggg tgatgtggtc aatctggtgt ccgtggacgt gcagcggctg 1260

accgagagcg tcctctacct caacgggctg tggctgcctc tcgtctggat cgtggtctgc 1320 

ttcgtctatc tctggcagct cctggggccc tccgccctca ctgccatcgc tgtcttcctg 1380 

agcctcctcc ctctgaattt cttcatctcc aagaaaagga accaccatca ggaggagcaa 1440 

atgaggcaga aggactcacg ggcacggctc accagctcta tcctcaggaa ctcgaagacc 1500 

atcaagttcc atggctggga gggagccttt ctggacagag tcctgggcat ccgaggccag 1560 

gagctgggcg ccttgcggac ctccggcctc ctcttctctg tgtcgctggt gtccttccaa 1620 

gtgtctacat ttctggtcgc actggtggtg tttgctgtcc acactctggt ggccgagaat 1680 

gctatgaatg cagagaaagc ctttgtgact ctcacagttc tcaacatcct caacaaggcc 1740 

caggctttcc tgcccttctc catccactcc ctcgtccagg cccgggtgtc ctttgaccgt 1800 

ctggtcacct tcctctgcct ggaagaagtt gaccctggtg tcgtagactc aagttcctct 1860 

ggaagcgctg ccgggaagga ttgcatcacc atacacagtg ccaccttcgc ctggtcccag 1920 

gaaagccctc cctgcctcca cagaataaac ctcacggtgc cccagggctg tctgctggct 1980 

gttgtcggtc cagtgggggc agggaagtcc tccctgctgt ccgccctcct tggggagctg 2040 

tcaaaggtgg aggggttcgt gagcatcgag ggtgctgtgg cctacgtgcc ccaggaggcc 2100 

tgggtgcaga acacctctgt ggtagagaat gtgtgcttcg ggcaggagct ggacccaccc 2160 

tggctggaga gagtactaga agcctgtgcc ctgcagccag atgtggacag cttccctgag 2220

ggaatccaca cttcaattgg ggagcagggc atgaatctct ccggaggcca gaagcagcgg 2280 

ctgagcctgg cccgggctgt atacagaaag gcagctgtgt acctgctgga tgaccccctg 2340 

gcggccctgg atgcccacgt tggccagcat gtcttcaacc aggtcattgg gcctggtggg 2400 

ctactccagg gaacaacacg gattctcgtg acgcacgcac tccacatcct gccccaggct 2460

gattggatca tagtgctggc aaatggggcc atcgcagaga tgggttccta ccaggagctt 2520

ctgcagagga agggggccct cgtgtgtctt ctggatcaag ccagacagcc aggagataga 2580

ggagaaggag aaacagaacc tgggaccagc accaaggacc ccagaggcac ctctgcaggc 2640

aggaggcccg agcttagacg cgagaggtcc atcaagtcag tccctgagaa ggaccgtacc 2700

acttcagaag cccagacaga ggttcctctg gatgaccctg acagggcagg atggccagca 2760 

ggaaaggaca gcatccaata cggcagggtg aaggccacag tgcacctggc ctacctgcgt 2820 

gccgtgggca cccccctctg cctctacgca ctcttcctct tcctctgcca gcaagtggcc 2880 

tccttctgcc ggggctactg gctgagcctg tgggcggacg accctgcagt aggtgggcag 2940 

cagacgcagg cagccctgcg tggcgggatc ttcgggctcc tcggctgtct ccaagccatt 3000

gggctgtttg cctccatggc tgcggtgctc ctaggtgggg cccgggcatc caggttgctc 3060

ttccagaggc tcctgtggga tgtggtgcga tctcccatca gcttctttga gcggacaccc 3120

attggtcacc tgctaaaccg cttctccaag gagacagaca cggttgacgt ggacattcca 3180

gacaaactcc ggtccctgct gatgtacgcc tttggactcc tggaggtcag cctggtggtg 3240

gcagtggcta ccccactggc cactgtggcc atcctgccac tgtttctcct ctacgctggg 3300

tttcagagcc tgtatgtggt tagctcatgc cagctgagac gcttggagtc agccagctac 3360

tcgtctgtct gctcccacat ggctgagacg ttccagggca gcacagtggt ccgggcattc 3420

cgaacccagg ccccctttgt ggctcagaac aatgctcgcg tagatgaaag ccagaggatc 3480

agtttcccgc gactggtggc tgacaggtgg cttgcggcca atgtggagct cctggggaat 3540

ggcctggtgt ttgcagccgc cacgtgtgct gtgctgagca aagcccacct cagtgctggc 3600

ctcgtgggct tctctgtctc tgctgccctc caggtgaccc agacactgca gtgggttgtt 3660

cgcaactgga cagacctaga gaacagcatc gtgtcagtgg agcggatgca ggactatgcc 3720

tggacgccca aggaggctcc ctggaggctg cccacatgtg cagctcagcc cccctggcct 3780

cagggcgggc agatcgagtt ccgggacttt gggctaagat gccgacctga gctcccgctg 3840

gctgtgcagg gcgtgtcctt caagatccac gcaggagaga aggtgggcat cgttggcagg 3900

accggggcag ggaagtcctc cctggccagt gggctgctgc ggctccagga ggcagctgag 3960

ggtgggatct ggatcgacgg ggtccccatt gcccacgtgg ggctgcacac actgcgctcc 4020

aggatcagca tcatccccca ggaccccatc ctgttccctg gctctctgcg gatgaacctc 4080

gacctgctgc aggagcactc ggacgaggct atctgggcag ccctggagac ggtgcagctc 4140

aaagccttgg tggccagcct gcccggccag ctgcagtaca agtgtgctga ccgaggcgag 4200

gacctgagcg tgggccagaa acagctcctg tgtctggcac gtgcccttct ccggaagacc 4260

cagatcctca tcctggacga ggctactgct gccgtggacc ctggcacgga gctgcagatg 4320

caggccatgc tcgggagctg gtttgcacag tgcactgtgc tgcccattgc ccaccgcctg 4380

cgctccgtga tggactgtgc ccgggttctg gtcatggaca aggggcaggt ggcagagagc 4440

ggcagcccgg cccagctgct ggcccagaag ggcctgtttt acagactggc ccaggagtca 4500

ggcctggtc 4509



<210> 8 

<211> 1503 

<212> PRT 

<213> Homo sapiens 





<400> 8

Met Ala Ala Pro Ala Glu Pro Cys Ala Gly Gln Gly Val Trp Asn Gln
1 5 10 15

Thr Glu Pro Glu Pro Ala Ala Thr Ser Leu Leu Ser Leu Cys Phe Leu
20 25 30

Arg Thr Ala Gly Val Trp Val Pro Pro Met Tyr Leu Trp Val Leu Gly
35 40 45

Pro Ile Tyr Leu Leu Phe Ile His His His Gly Arg Gly Tyr Leu Arg
50 55 60

Met Ser Pro Leu Phe Lys Ala Lys Met Val Leu Gly Phe Ala Leu Ile
65 70 75 80

Val Leu Cys Thr Ser Ser Val Ala Val Ala Leu Trp Lys Ile Gln Gln
85 90 95

Gly Thr Pro Glu Ala Pro Glu Phe Leu Ile His Pro Thr Val Trp Leu
100 105 110

Thr Thr Met Ser Phe Ala Val Phe Leu Ile His Thr Glu Arg Lys Lys
115 120 125

Gly Val Gln Ser Ser Gly Val Leu Phe Gly Tyr Trp Leu Leu Cys Phe
130 135 140

Val Leu Pro Ala Thr Asn Ala Ala Gln Gln Ala Ser Gly Ala Gly Phe
145 150 155 160

Gln Ser Asp Pro Val Arg His Leu Ser Thr Tyr Leu Cys Leu Ser Leu
165 170 175

Val Val Ala Gln Phe Val Leu Ser Cys Leu Ala Asp Gln Pro Pro Phe
180 185 190

Phe Pro Glu Asp Pro Gln Gln Ser Asn Pro Cys Pro Glu Thr Gly Ala
195 200 205

Ala Phe Pro Ser Lys Ala Thr Phe Trp Trp Val Ser Gly Leu Val Trp
210 215 220

Arg Gly Tyr Arg Arg Pro Leu Arg Pro Lys Asp Leu Trp Ser Leu Gly
225 230 235 240

Arg Glu Asn Ser Ser Glu Glu Leu Val Ser Arg Leu Glu Lys Glu Trp
245 250 255

Met Arg Asn Arg Ser Ala Ala Arg Arg His Asn Lys Ala Ile Ala Phe
260 265 270

Lys Arg Lys Gly Gly Ser Gly Met Lys Ala Pro Glu Thr Glu Pro Phe
275 280 285

Leu Arg Gln Glu Gly Ser Gln Trp Arg Pro Leu Leu Lys Ala Ile Trp
290 295 300

Gln Val Phe His Ser Thr Phe Leu Leu Gly Thr Leu Ser Leu Ile Ile
305 310 315 320

Ser Asp Val Phe Arg Phe Thr Val Pro Lys Leu Leu Ser Leu Phe Leu
325 330 335

Glu Phe Ile Gly Asp Pro Lys Pro Pro Ala Trp Lys Gly Tyr Leu Leu
340 345 350

Ala Val Leu Met Phe Leu Ser Ala Cys Leu Gln Thr Leu Phe Glu Gln
355 360 365

Gln Asn Met Tyr Arg Leu Lys Val Pro Gln Met Arg Leu Arg Ser Ala
370 375 380

Ile Thr Gly Leu Val Tyr Arg Lys Val Leu Ala Leu Ser Ser Gly Ser
385 390 395 400

Arg Lys Ala Ser Ala Val Gly Asp Val Val Asn Leu Val Ser Val Asp
405 410 415

Val Gln Arg Leu Thr Glu Ser Val Leu Tyr Leu Asn Gly Leu Trp Leu
420 425 430

Pro Leu Val Trp Ile Val Val Cys Phe Val Tyr Leu Trp Gln Leu Leu
435 440 445

Gly Pro Ser Ala Leu Thr Ala Ile Ala Val Phe Leu Ser Leu Leu Pro
450 455 460

Leu Asn Phe Phe Ile Ser Lys Lys Arg Asn His His Gln Glu Glu Gln
465 470 475 480

Met Arg Gln Lys Asp Ser Arg Ala Arg Leu Thr Ser Ser Ile Leu Arg
485 490 495

Asn Ser Lys Thr Ile Lys Phe His Gly Trp Glu Gly Ala Phe Leu Asp
500 505 510

Arg Val Leu Gly Ile Arg Gly Gln Glu Leu Gly Ala Leu Arg Thr Ser
515 520 525

Gly Leu Leu Phe Ser Val Ser Leu Val Ser Phe Gln Val Ser Thr Phe
530 535 540

Leu Val Ala Leu Val Val Phe Ala Val His Thr Leu Val Ala Glu Asn
545 550 555 560

Ala Met Asn Ala Glu Lys Ala Phe Val Thr Leu Thr Val Leu Asn Ile
565 570 575

Leu Asn Lys Ala Gln Ala Phe Leu Pro Phe Ser Ile His Ser Leu Val
580 585 590

Gln Ala Arg Val Ser Phe Asp Arg Leu Val Thr Phe Leu Cys Leu Glu
595 600 605

Glu Val Asp Pro Gly Val Val Asp Ser Ser Ser Ser Gly Ser Ala Ala
610 615 620

Gly Lys Asp Cys Ile Thr Ile His Ser Ala Thr Phe Ala Trp Ser Gln
625 630 635 640

Glu Ser Pro Pro Cys Leu His Arg Ile Asn Leu Thr Val Pro Gln Gly
645 650 655

Cys Leu Leu Ala Val Val Gly Pro Val Gly Ala Gly Lys Ser Ser Leu
660 665 670

Leu Ser Ala Leu Leu Gly Glu Leu Ser Lys Val Glu Gly Phe Val Ser
675 680 685

Ile Glu Gly Ala Val Ala Tyr Val Pro Gln Glu Ala Trp Val Gln Asn
690 695 700

Thr Ser Val Val Glu Asn Val Cys Phe Gly Gln Glu Leu Asp Pro Pro
705 710 715 720

Trp Leu Glu Arg Val Leu Glu Ala Cys Ala Leu Gln Pro Asp Val Asp
725 730 735

Ser Phe Pro Glu Gly Ile His Thr Ser Ile Gly Glu Gln Gly Met Asn
740 745 750

Leu Ser Gly Gly Gln Lys Gln Arg Leu Ser Leu Ala Arg Ala Val Tyr
755 760 765

Arg Lys Ala Ala Val Tyr Leu Leu Asp Asp Pro Leu Ala Ala Leu Asp
770 775 780

Ala His Val Gly Gln His Val Phe Asn Gln Val Ile Gly Pro Gly Gly
785 790 795 800

Leu Leu Gln Gly Thr Thr Arg Ile Leu Val Thr His Ala Leu His Ile
805 810 815

Leu Pro Gln Ala Asp Trp Ile Ile Val Leu Ala Asn Gly Ala Ile Ala
820 825 830

Glu Met Gly Ser Tyr Gln Glu Leu Leu Gln Arg Lys Gly Ala Leu Val
835 840 845

Cys Leu Leu Asp Gln Ala Arg Gln Pro Gly Asp Arg Gly Glu Gly Glu
850 855 860

Thr Glu Pro Gly Thr Ser Thr Lys Asp Pro Arg Gly Thr Ser Ala Gly
865 870 875 880

Arg Arg Pro Glu Leu Arg Arg Glu Arg Ser Ile Lys Ser Val Pro Glu
885 890 895

Lys Asp Arg Thr Thr Ser Glu Ala Gln Thr Glu Val Pro Leu Asp Asp
900 905 910

Pro Asp Arg Ala Gly Trp Pro Ala Gly Lys Asp Ser Ile Gln Tyr Gly
915 920 925

Arg Val Lys Ala Thr Val His Leu Ala Tyr Leu Arg Ala Val Gly Thr
930 935 940

Pro Leu Cys Leu Tyr Ala Leu Phe Leu Phe Leu Cys Gln Gln Val Ala
945 950 955 960

Ser Phe Cys Arg Gly Tyr Trp Leu Ser Leu Trp Ala Asp Asp Pro Ala
965 970 975

Val Gly Gly Gln Gln Thr Gln Ala Ala Leu Arg Gly Gly Ile Phe Gly
980 985 990

Leu Leu Gly Cys Leu Gln Ala Ile Gly Leu Phe Ala Ser Met Ala Ala
995 1000 1005

Val Leu Leu Gly Gly Ala Arg Ala Ser Arg Leu Leu Phe Gln Arg Leu
1010 1015 1020

Leu Trp Asp Val Val Arg Ser Pro Ile Ser Phe Phe Glu Arg Thr Pro
1025 1030 1035 1040

Ile Gly His Leu Leu Asn Arg Phe Ser Lys Glu Thr Asp Thr Val Asp
1045 1050 1055

Val Asp Ile Pro Asp Lys Leu Arg Ser Leu Leu Met Tyr Ala Phe Gly
1060 1065 1070

Leu Leu Glu Val Ser Leu Val Val Ala Val Ala Thr Pro Leu Ala Thr
1075 1080 1085

Val Ala Ile Leu Pro Leu Phe Leu Leu Tyr Ala Gly Phe Gln Ser Leu
1090 1095 1100

Tyr Val Val Ser Ser Cys Gln Leu Arg Arg Leu Glu Ser Ala Ser Tyr
1105 1110 1115 1120

Ser Ser Val Cys Ser His Met Ala Glu Thr Phe Gln Gly Ser Thr Val
1125 1130 1135

Val Arg Ala Phe Arg Thr Gln Ala Pro Phe Val Ala Gln Asn Asn Ala
1140 1145 1150

Arg Val Asp Glu Ser Gln Arg Ile Ser Phe Pro Arg Leu Val Ala Asp
1155 1160 1165

Arg Trp Leu Ala Ala Asn Val Glu Leu Leu Gly Asn Gly Leu Val Phe
1170 1175 1180

Ala Ala Ala Thr Cys Ala Val Leu Ser Lys Ala His Leu Ser Ala Gly
1185 1190 1195 1200

Leu Val Gly Phe Ser Val Ser Ala Ala Leu Gln Val Thr Gln Ala Leu
1205 1210 1215

Gln Trp Val Val Arg Asn Trp Thr Asp Leu Glu Asn Ser Ile Val Ser
1220 1225 1230

Val Glu Arg Met Gln Asp Tyr Ala Trp Thr Pro Lys Glu Ala Pro Trp
1235 1240 1245

Arg Leu Pro Thr Cys Ala Ala Gln Pro Pro Trp Pro Gln Gly Gly Gln
1250 1255 1260

Ile Glu Phe Arg Asp Phe Gly Leu Arg Cys Arg Pro Glu Leu Pro Leu
1265 1270 1275 1280

Ala Val Gln Gly Val Ser Leu Lys Ile His Ala Gly Glu Lys Val Gly
1285 1290 1295

Ile Val Gly Arg Thr Gly Ala Gly Lys Ser Ser Leu Ala Ser Gly Leu
1300 1305 1310

Leu Arg Leu Gln Glu Ala Ala Glu Gly Gly Ile Trp Ile Asp Gly Val
1315 1320 1325

Pro Ile Ala His Val Gly Leu His Thr Leu Arg Ser Arg Ile Ser Ile
1330 1335 1340

Ile Pro Gln Asp Pro Ile Leu Phe Pro Gly Ser Leu Arg Met Asn Leu
1345 1350 1355 1360

Asp Leu Leu Gln Glu His Ser Asp Glu Ala Ile Trp Ala Ala Leu Glu
1365 1370 1375

Thr Val Gln Leu Lys Ala Leu Val Ala Ser Leu Pro Gly Gln Leu Gln
1380 1385 1390

Tyr Lys Cys Ala Asp Arg Gly Glu Asp Leu Ser Val Gly Gln Lys Gln
1395 1400 1405

Leu Leu Cys Leu Ala Arg Ala Leu Leu Arg Lys Thr Gln Ile Leu Ile
1410 1415 1420

Leu Asp Glu Ala Thr Ala Ala Val Asp Pro Gly Thr Glu Leu Gln Met
1425 1430 1435 1440

Gln Ala Met Leu Gly Ser Trp Phe Ala Gln Cys Thr Val Leu Leu Ile
1445 1450 1455

Ala His Arg Leu Arg Ser Val Met Asp Cys Ala Arg Val Leu Val Met
1460 1465 1470

Asp Lys Gly Gln Val Ala Glu Ser Gly Ser Pro Ala Gln Leu Leu Ala
1475 1480 1485

Gln Lys Gly Leu Phe Tyr Arg Leu Ala Gln Glu Ser Gly Leu Val
1490 1495 1500



<210> 9 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"



<400> 9 
ctdgtdgcdg tdgtdggn 

<210> 10 
<211> 19 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Sequence source: /note="synthetic construct"



<400> 10 

atggccgcgc ctgctgagc 19 

<210> 11 

<211> 20 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Sequence source: /note="synthetic construct"
<400> 11 

gtctacgaca ccagggtcaa 20 



<210> 12 
<211> 20 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Sequence source: /note="synthetic construct"



<400> 12 

ctgcctggaa gaagttgacc 20

<210> 13
<211> 20
<212> DNA
<213> Artificial Sequence
<220>

<223> Sequence source: /note="synthetic construct"

<400> 13 
ctggaatgtc cacgtcaacc 

<210> 14 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"

<400> 14 
ggagacagac acggttgacg 

<210> 15 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"

<400> 15 
gcagaccagg cctgactcc 

<210> 16 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"

<400> 16 
rctnavngcn swnarnggnt crtc 

<210> 17 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"

<400> 17 
cgggatccag rgaraayath ctntttggn

<210> 18 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"

<400> 18 
cggaattcnt crtchagnag rtadatrtc 
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